Terminology Related to Machine Learning

Machine learning has become a crucial component of many sectors and professions, from banking to healthcare. Given its capacity to analyze and learn from enormous volumes of data, it is hardly surprising that machine learning is quickly gaining popularity. To apply it successfully, however, you first need to understand its vocabulary.

In this article, we will go over some of the basic concepts and terminology related to machine learning. By the end of this article, you will be able to describe what a feature is, explain how it relates to a sample, and differentiate between a categorical feature and a numerical feature. Additionally, we will discuss the various terms used to describe data, samples, and variables in machine learning.

What is a Sample?

A sample is an instance or example of an entity in your dataset. Typically, a sample is represented as a row in your dataset. For instance, the weather data for a particular day can be considered a sample. A dataset can consist of numerous samples, each with several values associated with it.

What are Variables?

Variables, also known as features, are the individual pieces of information recorded about a sample. Each sample may have several variables. For instance, the sample of weather data for a specific day may include variables such as that day's minimum temperature, maximum temperature, and amount of rainfall.

Machine learning uses a variety of terms for samples and variables. A sample is also called a record, example, row, instance, or observation. A variable is also called a feature, column, dimension, attribute, or field.

What are Numeric Variables?

Numeric variables take on numerical values, which can be measured and sorted in order. They can be continuous or discrete, and their values can be positive, negative, or both. For instance, age is a numeric variable: measured continuously, it can take any non-negative real value.
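
The weather example above can be sketched in a few lines of Python. This is only an illustration: the dataset, the variable names (min_temp_c, max_temp_c, rainfall_mm), and every value are made up, and a list of dictionaries is just one convenient way to hold rows and columns.

```python
# Hypothetical weather dataset (all names and values are invented).
# Each dictionary is one sample (a row); each key is one variable (a column).
weather = [
    {"date": "2024-01-01", "min_temp_c": -2.5, "max_temp_c": 4.0, "rainfall_mm": 0.0},
    {"date": "2024-01-02", "min_temp_c": 1.0, "max_temp_c": 7.5, "rainfall_mm": 12.3},
    {"date": "2024-01-03", "min_temp_c": -4.0, "max_temp_c": 2.0, "rainfall_mm": 3.1},
]

# Number of samples = number of rows.
num_samples = len(weather)

# The variables are the keys shared by every sample.
variable_names = list(weather[0].keys())

# Numeric variables can be sorted and averaged.
min_temps_sorted = sorted(sample["min_temp_c"] for sample in weather)
mean_max_temp = sum(sample["max_temp_c"] for sample in weather) / num_samples

print("samples:", num_samples)
print("variables:", variable_names)
print("minimum temperatures, sorted:", min_temps_sorted)
print("mean maximum temperature:", mean_max_temp)
```

Here the three dictionaries are the samples, and the four keys are the variables. The minimum temperature takes both negative and positive values, which shows that a numeric variable is not restricted to one sign.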

What are Categorical Variables?

Categorical variables have labels, names, or categories as values instead of numbers. They describe some quality or characteristic of an entity. For example, the color of a car can be described as red, silver, blue, or black. These values are non-numeric and sort each entity into a category. Categorical variables are also known as qualitative variables, or as nominal variables when the categories have no inherent order.

Data Types

Each variable has a data type associated with it. The most common data types in machine learning are numeric and categorical. However, there are other data types as well, such as string and date. Numeric variables can have integer or continuous values, while categorical variables have labels or categories as values.

To Summarize

In summary, a sample is an instance or example of an entity in your data, and a variable captures a specific characteristic of each entity. A sample can have multiple variables to describe it, which is why data from real applications are often multidimensional. Each variable has a data type associated with it, and the most common data types are numeric and categorical.

In conclusion, understanding the fundamental terms and principles of machine learning is essential to applying it correctly. With a solid grasp of the terminology used to describe data, samples, and variables, you can analyze and interpret data more effectively and navigate the complexities of machine learning.
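
As a closing illustration of the distinction between numeric and categorical variables, the sketch below labels each variable in a small, made-up car dataset. The helper variable_kind and the column names are hypothetical, and the rule it applies (a variable is numeric only if every value is a number) is a deliberate simplification: codes stored as numbers, such as postal codes, are really categorical and would be mislabeled by it.

```python
from collections import Counter
from numbers import Real

# Hypothetical car dataset (all names and values are invented).
cars = [
    {"color": "red", "doors": 4, "price_usd": 18500.0},
    {"color": "silver", "doors": 2, "price_usd": 24999.5},
    {"color": "red", "doors": 4, "price_usd": 15250.0},
    {"color": "black", "doors": 5, "price_usd": 21000.0},
]


def variable_kind(values):
    """Return 'numeric' if every value is a real number, else 'categorical'.

    Simplified heuristic for illustration only; booleans are excluded
    because Python treats them as integers.
    """
    is_numeric = all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    return "numeric" if is_numeric else "categorical"


# Label every variable (column) by inspecting its values across all samples.
kinds = {
    name: variable_kind([sample[name] for sample in cars])
    for name in cars[0]
}

# A categorical variable is summarized by counting its categories.
color_counts = Counter(sample["color"] for sample in cars)

print(kinds)
print(color_counts)
```

In this sketch, color is labeled categorical, while doors (discrete) and price_usd (continuous) are both labeled numeric. Counting categories is the natural summary for color, whereas sorting or averaging only makes sense for the numeric variables.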