Word
Description
Mahout Features:
- Mahout offers a framework to perform data mining tasks on large volumes of data
- Mahout enables applications to analyze large data sets efficiently and quickly
- It also offers distributed fitness function capabilities for evolutionary programming
- It includes several MapReduce-enabled clustering implementations, such as k-means, fuzzy k-means, Dirichlet, and Mean-Shift
A MapReduce framework is generally made up of three operations:
- Map: Each worker node applies the map function to its local data and writes the output to temporary storage. A master node ensures that only one copy of any redundant input data is processed.
- Shuffle: Worker nodes redistribute data based on the output keys (produced by the map function), so that all data belonging to one key ends up on the same worker node.
- Reduce: Worker nodes then process each group of output data, one group per key, in parallel.
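The three operations above can be sketched in a single process. This is only an illustration, not how Mahout or Hadoop implement them: the word-count task, the function names, and the idea of treating each input string as one worker's local data are all assumptions made for the example.

```python
from collections import defaultdict

def map_phase(chunk):
    # Map: emit a (word, 1) pair for every word in one worker's local chunk.
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle_phase(mapped_chunks):
    # Shuffle: collect all values that share a key into one group,
    # as if they had been routed to the same worker node.
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: combine the values of each key; here the reduction is a sum.
    return {key: sum(values) for key, values in groups.items()}

def word_count(chunks):
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle_phase(mapped))

# Two hypothetical chunks standing in for data held by two worker nodes.
print(word_count(["big data big", "data mining"]))
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; the sketch only shows how the data flows between the phases.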
When two or more products are purchased together, market basket analysis is carried out to check whether buying one product increases the probability of buying other products. This knowledge helps marketers group products or design a strategy to cross-sell products to a customer.
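One common way to check whether one purchase raises the probability of another is the lift measure: the probability of buying B given A, divided by the overall probability of buying B. The sketch below assumes that measure; the baskets and product names are invented for illustration.

```python
def lift(baskets, a, b):
    # Count how many baskets contain a, b, and both.
    n = len(baskets)
    with_a = sum(1 for basket in baskets if a in basket)
    with_b = sum(1 for basket in baskets if b in basket)
    with_both = sum(1 for basket in baskets if a in basket and b in basket)
    if n == 0 or with_a == 0 or with_b == 0:
        raise ValueError("both products must appear in at least one basket")
    confidence = with_both / with_a   # estimate of P(b | a)
    support_b = with_b / n            # estimate of P(b)
    return confidence / support_b

# Hypothetical shopping carts.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]
print(lift(baskets, "bread", "butter"))
```

A lift above 1 suggests that buying the first product goes along with a higher chance of buying the second; a value near 1 suggests the two purchases are unrelated.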
Assume that total sales are $100. This total can be divided into subcomponents: for example, $60 may be base sales, $20 may be due to price, $18 to distribution, and $2 to promotional activities. These numbers can be obtained using various logical methods, and each method can lead to a different breakdown. Therefore, it is very important to standardize the procedure for breaking total sales down into these components. This formal technique is known as MMM, or Market Mix Modeling.
As an example, suppose you have the grades of the students in a class and are asked how well the class is performing. Listing the grade of each student would not answer the question. However, you can find the mean of the class, which is representative of the class performance.
To find the mean, add all the numbers and then divide by the number of items in the set.
As an example, if the numbers are 1, 2, 3, 4, 5, 6, 7, 8, 8, then the mean would be 44/9 ≈ 4.89.
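The same calculation can be checked with a few lines of Python. The variable names are chosen for illustration; `statistics.mean` is part of the standard library.

```python
from statistics import mean

values = [1, 2, 3, 4, 5, 6, 7, 8, 8]

# Add all the numbers, then divide by how many there are.
manual_mean = sum(values) / len(values)

print(sum(values), len(values))   # 44 9
print(round(manual_mean, 2))      # 4.89
print(round(mean(values), 2))     # 4.89, same result from the standard library
```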
To find the median of a set of numbers, follow the steps below:
- Arrange the numbers in ascending or descending order
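- Find the middle value: for a set of n numbers, this is the value at position (n + 1) / 2 when n is odd, or the average of the values at positions n / 2 and n / 2 + 1 when n is even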
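The steps above can be sketched as a short function. Averaging the two central values for an even-sized set is the usual convention and is assumed here; the function name is illustrative.

```python
def median(values):
    # Step 1: arrange the numbers in ascending order.
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty set is undefined")
    # Step 2: pick the middle value (0-based index n // 2 for odd n).
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even n: average the two values around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 2, 3, 4, 5, 6, 7, 8, 8]))        # 5
print(median([4, 5, 2, 8, 4, 7, 6, 4, 6, 3]))     # 4.5
```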
Objectives of MIS:
- To drive decision making by providing accurate and up-to-date data on a range of organizational assets.
- To correlate multiple data points in order to design strategies that drive operations.
- Microsoft Azure Machine Learning Studio
- AWS Machine Learning
- IBM Watson Machine Learning
- Google Cloud Machine Learning Engine
- BigML
The mode can be calculated through the following steps:
- Count the number of times each value appears
- Take the value that appears the most
Let's understand with an example:
Suppose we have a data set that has 10 data points, listed below:
4,5,2,8,4,7,6,4,6,3
So now we will calculate the number of times each value has appeared.
Value | Count |
2 | 1 |
3 | 1 |
4 | 3 |
5 | 1 |
6 | 2 |
7 | 1 |
8 | 1 |
So we see that the value 4 is the one that is repeated the most, that is, 3 times. Therefore, the mode of this data set is 4.
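The counting procedure in this example can be reproduced with `collections.Counter` from the Python standard library. This is a minimal sketch; the variable names are illustrative.

```python
from collections import Counter

data = [4, 5, 2, 8, 4, 7, 6, 4, 6, 3]

# Step 1: count the number of times each value appears.
counts = Counter(data)

# Step 2: take the value that appears the most.
mode_value, mode_count = counts.most_common(1)[0]

print(sorted(counts.items()))   # [(2, 1), (3, 1), (4, 3), (5, 1), (6, 2), (7, 1), (8, 1)]
print(mode_value, mode_count)   # 4 3
```

If several values tie for the highest count, `most_common(1)` returns only one of them, so a data set with more than one mode needs an extra check.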
- Exploratory data analysis
- Scientific methods
Some of the criteria for choosing the model may be:
- Akaike information criterion (AIC)
- Adjusted R²
- Bayesian information criterion (BIC)
- Likelihood ratio test
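Three of these criteria have simple closed forms. The sketch below uses the standard textbook definitions; the function and parameter names are assumptions for illustration, with `k` the number of estimated parameters, `n` the number of observations, and `p` the number of predictors (excluding the intercept).

```python
import math

def aic(log_likelihood, k):
    # AIC = 2k - 2 ln(L); lower values indicate a better trade-off.
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    # BIC = k ln(n) - 2 ln(L); penalizes extra parameters more as n grows.
    return k * math.log(n) - 2 * log_likelihood

def adjusted_r2(r2, n, p):
    # Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1).
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical fitted model: log-likelihood -100, 3 parameters, 100 observations.
print(aic(-100.0, 3))
print(bic(-100.0, 3, 100))
print(adjusted_r2(0.9, 11, 1))
```

When comparing candidate models fitted to the same data, the model with the lower AIC or BIC, or the higher adjusted R², is usually preferred.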
As an example, suppose the goal is to predict the quality of a product, which can be excellent, good, average, fair, or poor. In this case, the target variable has 5 classes, so it is a 5-class classification problem.
As an example, we can do a bivariate analysis of a pair of continuous features and look for a relationship between them.
Consider this example: given a set of details about a student's interests, previous scores by subject, etc., we want to predict the GPA for every semester (GPA1, GPA2, …). This problem can be addressed through multivariate regression, since there is more than one dependent variable.