Machine Learning Glossary: Explore definitions and explanations of key AI and ML concepts
A base learner can be built from any common learning algorithm: logistic regression, a decision tree, an SVM, a neural network, a Bayesian classifier, k-nearest neighbors, and so on. If the individual learners are generated from the training data by the same learning algorithm, the ensemble is called a homogeneous ensemble, and the individual learners in this case are called base learners; an ensemble can also contain different […]
Definition Assume that x is a continuous random variable whose distribution depends on the class state, expressed in the form p(x|ω). This is the "class-conditional probability" function, that is, the probability function of x when the class state is ω. Class-conditional probability function $latex p\left(x | \omega_{i}\right)$ […]
CART is a method for learning the conditional probability distribution of the output random variable Y given the input random variable X. Definition CART assumes the decision tree is binary: each internal node tests a feature with the values "yes" and "no", the left branch takes the value "yes", and the right branch takes the value "no". This […]
Class imbalance describes a binary classification problem in which the two class labels differ greatly in frequency. For example, in a disease dataset where 0.0001 of the samples carry the positive label and 0.9999 carry the negative label, the classes are imbalanced; but in a […]
A closed form is an exact formula into which any value of the independent variable can be substituted to obtain the dependent variable, that is, a solution to the problem. Such a solution is expressed in terms of basic functions such as fractions, trigonometric functions, exponentials, logarithms, and even infinite series. The method used to find such a solution is also called the analytical method, a common calculus […]
Cluster analysis is a statistical data-analysis technique widely used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics. Clustering divides similar objects into groups or subsets by static classification methods, so that member objects in the same subset share [...]
A clustering ensemble is an algorithm for improving the accuracy, stability, and robustness of clustering results: integrating multiple base clustering results can produce a better one. The basic idea is to cluster the original dataset with multiple independent base clusterers, then combine their results with some ensemble method to obtain the best […]
A component of the decoder in a digital remote-control system, consisting of bistable flip-flops and coding switches. Each bistable has two states, "0" and "1". When n bistables are cascaded, there are 2^n possible combinations, each of which is a binary code group. The coding switches are wired according to the binary code groups. The purpose of the coding matrix is to convert the command […]
One of the conferences on computational learning theory, hosted by ACM and held annually. Computational learning theory can be seen as the intersection of theoretical computer science and machine learning, so COLT is widely regarded as a computer-science conference. Official website: https://learningtheory.org/colt2019 […]
Competitive learning is a learning method for artificial neural networks. When the network structure is fixed, learning reduces to modifying the connection weights. In competitive learning, all units in a network group compete for the right to respond to an external stimulus pattern; the winning unit's connection weights then change in a direction that favors its competing for that stimulus pattern.
Component learners are a type of individual learner used in ensemble learning. When the individual learners are generated by different learning algorithms, the ensemble is called a heterogeneous ensemble, and its individual learners are called component learners.
Interpretability means that when you need to understand or solve a problem, you can obtain the relevant information you need. Interpretability at the data level: give the neural network a clear, symbolic internal knowledge representation that matches the human knowledge framework, so that people can diagnose and modify the network at the semantic level. Interpretability of machine learning […]
Classification algorithms in machine learning divide attributes into discrete and continuous ones. Discrete attributes have finitely many or countably infinitely many values, which may or may not be represented by integers; for example, the attributes hair_color, smoker, medical_test, and drink_size all have finite value sets […]
Definition Cascade-Correlation is a supervised learning architecture that can build a near-minimal multi-layer network topology. Its advantages are that the user does not need to specify the network topology, and that it learns faster than traditional learning algorithms. Algorithm The Cascade-Correlation algorithm works as follows: start with a minimal network that contains only input and output […]
Definition Under specified conditions, using a reference standard to assign values to the characteristics of a measuring instrument (including reference materials) and to determine its indication error. Purpose To determine the indication error and whether it lies within the expected tolerance; to obtain a reported value of the deviation from the nominal value, which can be used to adjust the instrument or correct its indication; to give any […]
Definition For the differential equation $latex \frac{d \mathbf{x}}{dt}=\mathbf{f}(t, \mathbf{x}), \mathbf{x} \in \mathbb{R}^{n}$, if $latex […]
Bootstrapping is a method of uniform sampling with replacement from a given training set: whenever a sample is selected, it remains equally likely to be selected again and added to the training set again. The bootstrap method was first proposed by Bradley Efron in the Annals of Statistics in 1979. […]
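The sampling scheme above can be sketched in a few lines; this is a minimal illustration (the function name `bootstrap_sample` is chosen here for clarity, not taken from any library):

```python
import random

def bootstrap_sample(data):
    """Draw len(data) samples uniformly with replacement.

    Each draw is independent, so an item already drawn is
    equally likely to be drawn again.
    """
    return [random.choice(data) for _ in range(len(data))]

# Example: a bootstrap replicate of a ten-element training set.
sample = bootstrap_sample(list(range(10)))
```

Because draws are made with replacement, the replicate has the same size as the original set but typically repeats some items and omits others.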
For a given sample, the probability of being drawn in one random draw from a training set containing m samples is 1/m, and the probability of not being drawn is 1 − 1/m. The probability of never being drawn in m draws is (1 − 1/m)^m, and as m → ∞, (1 − 1/m)^m → 1/e ≈ 0 […]
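The limit can be checked numerically; this short sketch compares (1 − 1/m)^m for a large m against 1/e:

```python
import math

# Probability that a given sample is never drawn in m draws
# with replacement from a set of m samples.
m = 1_000_000
p_never_drawn = (1 - 1 / m) ** m

# As m grows, this approaches 1/e (about 0.368).
limit = math.exp(-1)
```

For m = 10^6 the two values already agree to several decimal places, which is why roughly 36.8% of the original samples are expected to be absent from a bootstrap replicate.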
The Boltzmann machine is a type of stochastic, recurrent neural network invented by Geoffrey Hinton and Terry Sejnowski in 1985. The Boltzmann machine can be viewed as a random process that generates corresponding […]
Definition Binary search is an algorithm whose input is a sorted list of elements. If the target element is contained in the list, binary search returns its position; otherwise it returns null. Basic idea The method is suited to large amounts of data; to use binary search, the data must be sorted. Assume the data is in ascending order […]
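The idea above can be sketched as follows, assuming ascending order and returning None (Python's null) when the target is absent:

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or None."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # probe the middle element
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1            # target lies in the upper half
        else:
            hi = mid - 1            # target lies in the lower half
    return None

# Example usage on a small ascending list.
idx = binary_search([1, 3, 5, 7, 9], 7)
```

Each iteration halves the search interval, so the running time is O(log n) rather than the O(n) of a linear scan.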
Definition The binomial test compares the observed frequencies of the two categories of a dichotomous variable with the frequencies expected under a binomial distribution with a specified probability parameter, which is 0.5 for both groups by default. Example A coin is tossed and the probability of heads is 1/2. Under this hypothesis, the coin is tossed 40 times [...]
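One common way to compute a two-sided binomial p-value is to sum the probabilities of all outcomes no more likely than the observed count; the sketch below implements that convention from scratch (the function name is illustrative, not from a library):

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Two-sided binomial test p-value for k successes in n trials.

    Sums the probabilities of all outcomes whose probability under
    Binomial(n, p) does not exceed that of the observed count k.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    return sum(q for q in probs if q <= observed + 1e-12)

# Example: 26 heads in 40 tosses of a fair coin.
p_value = binomial_two_sided_p(26, 40)
```

For the coin example in the entry, a count near n/2 yields a p-value near 1, while an extreme count yields a very small one.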
Indicates that a classification task has only two categories; for example, we want to identify whether a picture shows a cat. That is, we train a classifier that takes a picture, represented by the feature vector x, and outputs whether it is a cat, represented by y = 0 or 1. Binary classification assumes each sample is assigned one and only one label, 0 […]
Definition Deep neural networks have shown superior results in many fields, such as speech recognition, image processing, and natural language processing. LSTM, a variant of the RNN, can learn long-term dependencies in data that a plain RNN cannot. In 2005, Graves proposed combining LSTM with […]
The bias-variance dilemma means that bias and variance cannot both be reduced at the same time; one can only strike a balance between the two. To reduce a model's bias, you increase its complexity to prevent underfitting; but making the model too complex increases variance and causes overfitting. You therefore need to find a balance in model complexity, which can […]