Machine Learning Glossary: Explore definitions and explanations of key AI and ML concepts
Bias-variance decomposition is a tool for explaining the generalization performance of a learning algorithm in terms of bias and variance. The specific setup is as follows: assume there are K data sets, each drawn independently from a distribution p(t, x) (t is the variable to be predicted, x is the feature variable). In different […]
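As an illustration, bias² and variance can be estimated empirically by training the same model on K independently drawn data sets. This is a minimal NumPy sketch, not from the original article; the sine target, noise level, and polynomial degree are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    return np.sin(x)

# Draw K independent training sets from p(t, x), fit a polynomial of fixed
# degree on each, and evaluate every fitted model at the same test points.
K, n, degree = 200, 30, 3
x_test = np.linspace(0, np.pi, 50)
preds = np.empty((K, x_test.size))
for k in range(K):
    x = rng.uniform(0, np.pi, n)
    t = true_fn(x) + rng.normal(0, 0.3, n)
    coeffs = np.polyfit(x, t, degree)
    preds[k] = np.polyval(coeffs, x_test)

mean_pred = preds.mean(axis=0)
bias_sq = np.mean((mean_pred - true_fn(x_test)) ** 2)  # (E[y] - t)^2, averaged over x
variance = np.mean(preds.var(axis=0))                  # E[(y - E[y])^2], averaged over x
print(bias_sq, variance)
```

Averaging the K predictions approximates the expected output; its squared distance from the truth is the (squared) bias, and the spread of individual predictions around it is the variance.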
Definition: the difference between the expected output and the true label is called bias. [Figure: the relationship between bias and variance.]
The inter-class (between-class) scatter matrix represents the dispersion of the class means around the overall sample mean. Mathematical definition: $latex {S_b = \sum_{i} n_i (\mu_i - \mu)(\mu_i - \mu)^T}$, where $latex {\mu_i}$ and $latex {n_i}$ are the mean and size of class i and $latex {\mu}$ is the overall mean.
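A minimal NumPy sketch of the standard between-class scatter matrix used in linear discriminant analysis (the tiny two-class data set below is made up for illustration):

```python
import numpy as np

def between_class_scatter(X, y):
    """S_B = sum over classes i of n_i * (mu_i - mu)(mu_i - mu)^T."""
    mu = X.mean(axis=0)                          # overall sample mean
    S_B = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(y):
        Xc = X[y == c]
        d = (Xc.mean(axis=0) - mu).reshape(-1, 1)  # class mean minus overall mean
        S_B += len(Xc) * (d @ d.T)
    return S_B

X = np.array([[1.0, 2.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]])
y = np.array([0, 0, 1, 1])
print(between_class_scatter(X, y))
```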
Definition: A Bayesian network is one of the most effective theoretical models for representing and reasoning about uncertain knowledge. A Bayesian network consists of nodes, which represent random variables, and directed edges connecting those nodes, which represent dependencies between the variables; the strength of a dependency is expressed by a conditional probability. A node with no parent […]
Basic concepts: Bayesian decision theory is a fundamental method of statistical decision making. Its basic idea: given the class-conditional probability densities and the prior probabilities, convert them into posterior probabilities via Bayes' formula, then classify according to the magnitude of the posterior probabilities. Related formulas: let D1, D2, ..., Dn be samples […]
To minimize the overall risk, select for each sample the class label that minimizes the conditional risk R(c|x); the resulting classifier h∗ is the Bayes optimal classifier.
In model selection, a single "best" model is usually chosen from a set of candidate models and then used for prediction. In contrast to using a single best model, Bayesian model averaging assigns a weight to each candidate model and takes a weighted average of their predictions to determine the final prediction. The weight assigned to a model is […]
For each sample x, if h minimizes the conditional risk R(h(x)|x), then the overall risk is also minimized. This yields the Bayes decision rule: to minimize the overall risk, we only need to choose, for each sample, the label that minimizes the conditional risk R(c|x […]
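The rule can be sketched numerically. In this hypothetical two-class example, the loss matrix and posterior values are made up for illustration:

```python
import numpy as np

# lam[i, j] is the loss of predicting class i when the true class is j;
# post[j] = P(c_j | x) is the posterior for this sample x.
lam = np.array([[0.0, 1.0],
                [5.0, 0.0]])   # predicting class 1 when the truth is 0 costs 5
post = np.array([0.3, 0.7])

# Conditional risk R(c_i | x) = sum_j lam[i, j] * P(c_j | x)
risk = lam @ post
h_star = int(np.argmin(risk))  # Bayes decision rule: pick the minimum-risk label
print(risk, h_star)
```

Even though class 1 has the larger posterior (0.7), the asymmetric loss makes predicting class 0 the lower-risk choice here.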
Batch normalization (BN) is a normalization technique, with a regularizing side effect, that speeds up the training of large convolutional networks and improves classification accuracy after convergence. When BN is applied to a layer of a neural network, it standardizes the layer's activations within each mini-batch so that each feature has approximately zero mean and unit variance, reducing […]
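A minimal sketch of the standardization step, with the learned scale/shift parameters gamma and beta; the running-statistics machinery a full BN layer keeps for inference is omitted:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standardize a mini-batch per feature, then scale and shift.

    x has shape (batch, features). With gamma=1 and beta=0, each output
    feature has ~zero mean and ~unit variance over the mini-batch.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero
    return gamma * x_hat + beta

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
out = batch_norm(x, gamma=1.0, beta=0.0)
print(out.mean(axis=0), out.std(axis=0))
```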
In ensemble learning, an ensemble whose individual learners are all generated by the same learning algorithm is homogeneous. Such learners are called base learners, and the corresponding learning algorithm is called the base learning algorithm.
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture for time series, first published in 1997. Thanks to its unique gated design, LSTM is well suited to processing and predicting important events in time series with very long intervals and delays. […]
Information entropy is a quantity that measures the amount of information; it was proposed by Shannon in 1948. Borrowing the concept of entropy from thermodynamics, he called the average amount of information left after redundancy is removed the information entropy, and gave a mathematical expression for it. Three properties of information entropy: Monotonicity: the higher the probability of an event, the less information its occurrence carries […]
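Shannon's expression, H(X) = -Σ p(x) log₂ p(x), is easy to check on small distributions (a minimal sketch, not from the original article):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p * log2(p), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1 bit
print(entropy([1.0]))        # certain event: 0 bits
print(entropy([0.25] * 4))   # uniform over 4 outcomes: 2 bits
```

The certain event carries no information, and entropy is maximized by the uniform distribution, matching the monotonicity property above.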
Knowledge representation refers to the representation and description of knowledge; it is concerned with how agents can reasonably use relevant knowledge, and it studies thinking as a computational process. Strictly speaking, knowledge representation and knowledge reasoning are two closely related concepts within the same research field, but in practice "knowledge representation" is also used broadly to encompass reasoning.
The exponential loss function is a loss function commonly used in the AdaBoost algorithm; its expression takes an exponential form. [Figure: schematic of common loss functions.] Common losses: Exponential loss: mainly used in the AdaBoost ensemble learning algorithm; Hinge loss: H […]
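The exponential form is L(y, f(x)) = exp(-y·f(x)) for labels y ∈ {-1, +1}, which is a quick sketch to evaluate:

```python
import math

def exp_loss(y, f):
    """Exponential loss L(y, f(x)) = exp(-y * f(x)), with y in {-1, +1}."""
    return math.exp(-y * f)

print(exp_loss(1, 2.0))    # confident and correct: small loss
print(exp_loss(1, -2.0))   # confident but wrong: large loss
```

The loss decays toward zero for confident correct predictions and grows exponentially for confident mistakes, which is why AdaBoost focuses later rounds on misclassified points.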
In machine learning, ground truth refers to the accurate label values of the training set for the classification task in supervised learning; it is generally used for error estimation and performance evaluation. In supervised learning, labeled data usually appears as pairs (x, t), where x is the input and t is the label. The correct label is the grou […]
Error-ambiguity decomposition (also rendered as error-divergence decomposition) refers to decomposing the ensemble's generalization error, which can be expressed as follows: $latex {E= \overline {E}- \overline {A}}$ where the left-hand side E is the ensemble's generalization error, and on the right-hand side $latex {\over […]
MCMC (Markov chain Monte Carlo) is a family of algorithms for sampling from probability distributions by means of Markov chains. It approximates the posterior distribution of a parameter of interest by random sampling in the probability space. The theoretical basis of MCMC is the Markov process: to sample from a specified distribution, we can simulate a Markov process starting from an arbitrary state.
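A toy random-walk Metropolis sampler, one member of the MCMC family, illustrates the idea; the step size, sample count, and standard-normal target are arbitrary choices for this sketch:

```python
import math
import random

random.seed(0)

def metropolis(log_p, x0, n_samples, step=1.0):
    """Random-walk Metropolis: sample from a density known only up to a constant."""
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + random.gauss(0, step)
        # Accept with probability min(1, p(proposal) / p(x))
        if math.log(random.random()) < log_p(proposal) - log_p(x):
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal, log p(x) = -x^2 / 2 (up to an additive constant)
samples = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
print(mean)  # close to 0
```

Whatever state the chain starts from, its stationary distribution is the target, so long-run sample averages approximate expectations under it.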
An evolutionary algorithm is a general problem-solving method that draws on the natural selection and genetic mechanisms of the biological world. Basic approach: use a simple encoding to represent complex structures; use simple genetic operators and survival-of-the-fittest selection to guide learning and determine the search direction; and organize the search around a population, so that […]
A genetic algorithm (GA) is a search algorithm used in computational mathematics to solve optimization problems; it is a type of evolutionary algorithm. Evolutionary algorithms originally borrowed phenomena from evolutionary biology, including inheritance, mutation, natural selection, and crossover (hybridization). Genetic algorithms are usually implemented as computer simulations. For an optimization problem, if there are […]
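A toy GA on the OneMax problem (maximize the number of 1-bits) shows selection, crossover, and mutation working together; all hyperparameters below (population size, tournament size, mutation rate) are illustrative choices, not from the original article:

```python
import random

random.seed(1)

def genetic_algorithm(n_bits=20, pop_size=30, generations=60, p_mut=0.05):
    """Toy GA maximizing the number of 1-bits in a bitstring (OneMax)."""
    fitness = lambda ind: sum(ind)
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Tournament selection: best of 3 random individuals
            return max(random.sample(pop, 3), key=fitness)
        new_pop = []
        while len(new_pop) < pop_size:
            a, b = select(), select()
            cut = random.randrange(1, n_bits)                # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]  # bit-flip mutation
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

best = genetic_algorithm()
print(sum(best))  # near the optimum of n_bits
```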
Gain ratio usually refers to the information gain ratio: the ratio of a node's information gain to its split information. The gain ratio is commonly used as an attribute selection criterion; the other two common criteria are information gain and the Gini index. The gain ratio formula is as follows: $latex {GainRatio{ \left( {R} […]
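A minimal sketch computing GainRatio(A) = Gain(A) / SplitInfo(A) on a hypothetical toy feature (the data below is made up so that the feature splits the labels perfectly):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature, labels):
    """GainRatio(A) = Gain(A) / SplitInfo(A) for a categorical feature."""
    n = len(labels)
    gain = entropy(labels)
    split_info = 0.0
    for v in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == v]
        w = len(subset) / n
        gain -= w * entropy(subset)       # subtract weighted child entropy
        split_info -= w * math.log2(w)    # penalize many-valued splits
    return gain / split_info if split_info else 0.0

feature = ['a', 'a', 'b', 'b']
labels  = [0, 0, 1, 1]
print(gain_ratio(feature, labels))  # perfect binary split: 1.0
```

The split-information denominator is what distinguishes gain ratio from plain information gain: it penalizes attributes with many values.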
A Hilbert space is a complete inner product space, i.e., a vector space equipped with an inner product that is complete with respect to the induced norm. Hilbert space builds on finite-dimensional Euclidean space and can be seen as its generalization: it is not limited to real scalars or finite dimension. Like Euclidean space, a Hilbert space is an inner product space with notions of distance and angle […]
A Hidden Markov Model (HMM) is a probabilistic model of time series that describes a process in which a hidden Markov chain of states generates an observable random sequence. It is a statistical model used to describe a Markov chain with hidden states and unknown parameters.
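One basic HMM computation is the forward algorithm, which gives the probability of an observation sequence by summing over all hidden state paths. The two-state model below is made up for illustration:

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward algorithm: P(observations | HMM).

    A: state transition matrix, B: emission matrix, pi: initial state
    distribution, obs: sequence of observation indices.
    """
    alpha = pi * B[:, obs[0]]              # initialize with first observation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # propagate, then weight by emission
    return alpha.sum()

# Hypothetical 2-state model emitting observation 0 or 1
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.6, 0.4], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
print(forward(A, B, pi, [0, 1, 0]))
```

The recursion costs O(T·N²) for T observations and N states, versus the O(N^T) cost of enumerating every hidden path.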
Hidden layers are the layers of a multilayer feedforward neural network other than the input layer and the output layer. Hidden layers neither receive signals directly from the outside world nor send signals directly to it; they are needed when the data is not linearly separable. Hidden-layer units can take many forms, such as max-pooling layers and convolutional layers, which compute different mathematical functions. […]
Hard voting is a voting method that directly outputs class labels; it is used mainly in classification algorithms. Voting is a combination strategy for classification problems in ensemble learning; its basic idea is to select the class output most often by the individual learners. Hard voting selects the label output by the greatest number of learners; if several labels tie, they are taken in ascending order. […]
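The majority vote with an ascending-order tie-break can be sketched in a few lines (a minimal illustration; real ensemble libraries wrap this in an estimator interface):

```python
from collections import Counter

def hard_vote(predictions):
    """Majority vote over per-classifier labels; ties broken by the smallest label."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Among the most frequent labels, take the first in ascending order
    return min(label for label, c in counts.items() if c == top)

print(hard_vote([1, 1, 0]))  # majority -> 1
print(hard_vote([0, 1]))     # tie -> 0 (ascending order)
```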