Machine Learning Glossary: Explore definitions and explanations of key AI and ML concepts
Bias-variance decomposition is a tool for explaining the generalization performance of a learning algorithm in terms of bias and variance. The specific setup is as follows: assume there are K data sets, each drawn independently from a distribution p(t, x) (t is the variable to be predicted, x is the feature variable). In different […]
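As an illustration, bias² and variance can be estimated empirically by training the same model on K independently drawn data sets. This is a minimal NumPy sketch, not from the original article; the sine target, noise level, and polynomial degree are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    return np.sin(x)

# Draw K independent training sets from p(t, x), fit a polynomial of fixed
# degree on each, and evaluate every fitted model at the same test points.
K, n, degree = 200, 30, 3
x_test = np.linspace(0, np.pi, 50)
preds = np.empty((K, x_test.size))
for k in range(K):
    x = rng.uniform(0, np.pi, n)
    t = true_fn(x) + rng.normal(0, 0.3, n)
    coeffs = np.polyfit(x, t, degree)
    preds[k] = np.polyval(coeffs, x_test)

mean_pred = preds.mean(axis=0)
bias_sq = np.mean((mean_pred - true_fn(x_test)) ** 2)  # (E[y] - t)^2, averaged over x
variance = np.mean(preds.var(axis=0))                  # E[(y - E[y])^2], averaged over x
print(bias_sq, variance)
```

Averaging the K predictions approximates the expected output; its squared distance from the truth is the (squared) bias, and the spread of individual predictions around it is the variance.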
Definition: the difference between the expected output and the true label is called bias. [Figure: the relationship between bias and variance.]
The inter-class (between-class) scatter matrix represents the dispersion of the class means around the overall sample mean. Mathematical definition: $latex {S_b = \sum_{i} n_i (\mu_i - \mu)(\mu_i - \mu)^T}$, where $latex {\mu_i}$ and $latex {n_i}$ are the mean and size of class i and $latex {\mu}$ is the overall mean.
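A minimal NumPy sketch of the standard between-class scatter matrix used in linear discriminant analysis (the tiny two-class data set below is made up for illustration):

```python
import numpy as np

def between_class_scatter(X, y):
    """S_B = sum over classes i of n_i * (mu_i - mu)(mu_i - mu)^T."""
    mu = X.mean(axis=0)                          # overall sample mean
    S_B = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(y):
        Xc = X[y == c]
        d = (Xc.mean(axis=0) - mu).reshape(-1, 1)  # class mean minus overall mean
        S_B += len(Xc) * (d @ d.T)
    return S_B

X = np.array([[1.0, 2.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]])
y = np.array([0, 0, 1, 1])
print(between_class_scatter(X, y))
```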
Definition: A Bayesian network is one of the most effective theoretical models for representing and reasoning about uncertain knowledge. A Bayesian network consists of nodes, which represent random variables, and directed edges connecting those nodes, which represent dependencies between the variables; the strength of a dependency is expressed by a conditional probability. A node with no parent […]
Basic concepts: Bayesian decision theory is a fundamental method of statistical decision making. Its basic idea: given the class-conditional probability densities and the prior probabilities, convert them into posterior probabilities via Bayes' formula, then classify according to the magnitude of the posterior probabilities. Related formulas: let D1, D2, ..., Dn be samples […]
To minimize the overall risk, select for each sample the class label that minimizes the conditional risk R(c|x); the resulting classifier h∗ is the Bayes optimal classifier.
In model selection, a single "best" model is usually chosen from a set of candidate models and then used for prediction. In contrast to using a single best model, Bayesian model averaging assigns a weight to each candidate model and takes a weighted average of their predictions to determine the final prediction. The weight assigned to a model is […]
For each sample x, if h minimizes the conditional risk R(h(x)|x), then the overall risk is also minimized. This yields the Bayes decision rule: to minimize the overall risk, we only need to choose, for each sample, the label that minimizes the conditional risk R(c|x […]
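The rule can be sketched numerically. In this hypothetical two-class example, the loss matrix and posterior values are made up for illustration:

```python
import numpy as np

# lam[i, j] is the loss of predicting class i when the true class is j;
# post[j] = P(c_j | x) is the posterior for this sample x.
lam = np.array([[0.0, 1.0],
                [5.0, 0.0]])   # predicting class 1 when the truth is 0 costs 5
post = np.array([0.3, 0.7])

# Conditional risk R(c_i | x) = sum_j lam[i, j] * P(c_j | x)
risk = lam @ post
h_star = int(np.argmin(risk))  # Bayes decision rule: pick the minimum-risk label
print(risk, h_star)
```

Even though class 1 has the larger posterior (0.7), the asymmetric loss makes predicting class 0 the lower-risk choice here.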
Batch normalization (BN) is a normalization technique, with a regularizing side effect, that speeds up the training of large convolutional networks and improves classification accuracy after convergence. When BN is applied to a layer of a neural network, it standardizes the layer's activations within each mini-batch so that each feature has approximately zero mean and unit variance, reducing […]
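A minimal sketch of the standardization step, with the learned scale/shift parameters gamma and beta; the running-statistics machinery a full BN layer keeps for inference is omitted:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standardize a mini-batch per feature, then scale and shift.

    x has shape (batch, features). With gamma=1 and beta=0, each output
    feature has ~zero mean and ~unit variance over the mini-batch.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero
    return gamma * x_hat + beta

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
out = batch_norm(x, gamma=1.0, beta=0.0)
print(out.mean(axis=0), out.std(axis=0))
```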
In ensemble learning, an ensemble whose individual learners are all generated by the same learning algorithm is homogeneous. Such learners are called base learners, and the corresponding learning algorithm is called the base learning algorithm.
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture for time series, first published in 1997. Thanks to its unique gated design, LSTM is well suited to processing and predicting important events in time series with very long intervals and delays. […]
Information entropy is a quantity that measures the amount of information; it was proposed by Shannon in 1948. Borrowing the concept of entropy from thermodynamics, he called the average amount of information left after redundancy is removed the information entropy, and gave a mathematical expression for it. Three properties of information entropy: Monotonicity: the higher the probability of an event, the less information its occurrence carries […]
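Shannon's expression, H(X) = -Σ p(x) log₂ p(x), is easy to check on small distributions (a minimal sketch, not from the original article):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p * log2(p), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1 bit
print(entropy([1.0]))        # certain event: 0 bits
print(entropy([0.25] * 4))   # uniform over 4 outcomes: 2 bits
```

The certain event carries no information, and entropy is maximized by the uniform distribution, matching the monotonicity property above.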
Knowledge representation refers to the representation and description of knowledge; it is concerned with how agents can reasonably use relevant knowledge, and it studies thinking as a computational process. Strictly speaking, knowledge representation and knowledge reasoning are two closely related concepts within the same research field, but in practice "knowledge representation" is also used broadly to encompass reasoning.
The exponential loss function is a loss function commonly used in the AdaBoost algorithm; its expression takes an exponential form. [Figure: schematic of common loss functions.] Common losses: Exponential loss: mainly used in the AdaBoost ensemble learning algorithm; Hinge loss: H […]
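The exponential form is L(y, f(x)) = exp(-y·f(x)) for labels y ∈ {-1, +1}, which is a quick sketch to evaluate:

```python
import math

def exp_loss(y, f):
    """Exponential loss L(y, f(x)) = exp(-y * f(x)), with y in {-1, +1}."""
    return math.exp(-y * f)

print(exp_loss(1, 2.0))    # confident and correct: small loss
print(exp_loss(1, -2.0))   # confident but wrong: large loss
```

The loss decays toward zero for confident correct predictions and grows exponentially for confident mistakes, which is why AdaBoost focuses later rounds on misclassified points.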
In machine learning, ground truth refers to the accurate label values of the training set for the classification task in supervised learning; it is generally used for error estimation and performance evaluation. In supervised learning, labeled data usually appears as pairs (x, t), where x is the input and t is the label. The correct label is the grou […]
Error-ambiguity decomposition (also rendered as error-divergence decomposition) refers to decomposing the ensemble's generalization error, which can be expressed as follows: $latex {E= \overline {E}- \overline {A}}$ where the left-hand side E is the ensemble's generalization error, and on the right-hand side $latex {\over […]
MCMC (Markov chain Monte Carlo) is a family of algorithms for sampling from probability distributions by means of Markov chains. It approximates the posterior distribution of a parameter of interest by random sampling in the probability space. The theoretical basis of MCMC is the Markov process: to sample from a specified distribution, we can simulate a Markov process starting from an arbitrary state.
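A toy random-walk Metropolis sampler, one member of the MCMC family, illustrates the idea; the step size, sample count, and standard-normal target are arbitrary choices for this sketch:

```python
import math
import random

random.seed(0)

def metropolis(log_p, x0, n_samples, step=1.0):
    """Random-walk Metropolis: sample from a density known only up to a constant."""
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + random.gauss(0, step)
        # Accept with probability min(1, p(proposal) / p(x))
        if math.log(random.random()) < log_p(proposal) - log_p(x):
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal, log p(x) = -x^2 / 2 (up to an additive constant)
samples = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
print(mean)  # close to 0
```

Whatever state the chain starts from, its stationary distribution is the target, so long-run sample averages approximate expectations under it.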
An evolutionary algorithm is a general problem-solving method that draws on the natural selection and genetic mechanisms of the biological world. Basic approach: use a simple encoding to represent complex structures; use simple genetic operators and survival-of-the-fittest selection to guide learning and determine the search direction; and organize the search around a population, so that […]
A genetic algorithm (GA) is a search algorithm used in computational mathematics to solve optimization problems; it is a type of evolutionary algorithm. Evolutionary algorithms originally borrowed phenomena from evolutionary biology, including inheritance, mutation, natural selection, and crossover (hybridization). Genetic algorithms are usually implemented as computer simulations. For an optimization problem, if there are […]
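A toy GA on the OneMax problem (maximize the number of 1-bits) shows selection, crossover, and mutation working together; all hyperparameters below (population size, tournament size, mutation rate) are illustrative choices, not from the original article:

```python
import random

random.seed(1)

def genetic_algorithm(n_bits=20, pop_size=30, generations=60, p_mut=0.05):
    """Toy GA maximizing the number of 1-bits in a bitstring (OneMax)."""
    fitness = lambda ind: sum(ind)
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Tournament selection: best of 3 random individuals
            return max(random.sample(pop, 3), key=fitness)
        new_pop = []
        while len(new_pop) < pop_size:
            a, b = select(), select()
            cut = random.randrange(1, n_bits)                # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]  # bit-flip mutation
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

best = genetic_algorithm()
print(sum(best))  # near the optimum of n_bits
```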
Gain ratio usually refers to the information gain ratio: the ratio of a node's information gain to its split information. The gain ratio is commonly used as an attribute selection criterion; the other two common criteria are information gain and the Gini index. The gain ratio formula is as follows: $latex {GainRatio{ \left( {R} […]
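A minimal sketch computing GainRatio(A) = Gain(A) / SplitInfo(A) on a hypothetical toy feature (the data below is made up so that the feature splits the labels perfectly):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature, labels):
    """GainRatio(A) = Gain(A) / SplitInfo(A) for a categorical feature."""
    n = len(labels)
    gain = entropy(labels)
    split_info = 0.0
    for v in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == v]
        w = len(subset) / n
        gain -= w * entropy(subset)       # subtract weighted child entropy
        split_info -= w * math.log2(w)    # penalize many-valued splits
    return gain / split_info if split_info else 0.0

feature = ['a', 'a', 'b', 'b']
labels  = [0, 0, 1, 1]
print(gain_ratio(feature, labels))  # perfect binary split: 1.0
```

The split-information denominator is what distinguishes gain ratio from plain information gain: it penalizes attributes with many values.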
A Hilbert space is a complete inner product space, i.e., a vector space equipped with an inner product that is complete with respect to the induced norm. Hilbert space builds on finite-dimensional Euclidean space and can be seen as its generalization: it is not limited to real scalars or finite dimension. Like Euclidean space, a Hilbert space is an inner product space with notions of distance and angle […]
A Hidden Markov Model (HMM) is a probabilistic model of time series that describes a process in which a hidden Markov chain of states generates an observable random sequence. It is a statistical model used to describe a Markov chain with hidden states and unknown parameters.
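One basic HMM computation is the forward algorithm, which gives the probability of an observation sequence by summing over all hidden state paths. The two-state model below is made up for illustration:

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward algorithm: P(observations | HMM).

    A: state transition matrix, B: emission matrix, pi: initial state
    distribution, obs: sequence of observation indices.
    """
    alpha = pi * B[:, obs[0]]              # initialize with first observation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # propagate, then weight by emission
    return alpha.sum()

# Hypothetical 2-state model emitting observation 0 or 1
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.6, 0.4], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
print(forward(A, B, pi, [0, 1, 0]))
```

The recursion costs O(T·N²) for T observations and N states, versus the O(N^T) cost of enumerating every hidden path.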
Hidden layers are the layers of a multilayer feedforward neural network other than the input layer and the output layer. Hidden layers neither receive signals directly from the outside world nor send signals directly to it; they are needed when the data is not linearly separable. Hidden-layer units can take many forms, such as max-pooling layers and convolutional layers, which compute different mathematical functions. […]
Hard voting is a voting method that directly outputs class labels; it is used mainly in classification algorithms. Voting is a combination strategy for classification problems in ensemble learning; its basic idea is to select the class output most often by the individual learners. Hard voting selects the label output by the greatest number of learners; if several labels tie, they are taken in ascending order. […]
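The majority vote with an ascending-order tie-break can be sketched in a few lines (a minimal illustration; real ensemble libraries wrap this in an estimator interface):

```python
from collections import Counter

def hard_vote(predictions):
    """Majority vote over per-classifier labels; ties broken by the smallest label."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Among the most frequent labels, take the first in ascending order
    return min(label for label, c in counts.items() if c == top)

print(hard_vote([1, 1, 0]))  # majority -> 1
print(hard_vote([0, 1]))     # tie -> 0 (ascending order)
```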