Dann Brown

I am a Senior Fullstack Software Developer, working on my skills and learning new things about tech every day.

Fundamentals of machine learning and data analytics

Introduction

This post summarizes several concepts from a book on data analytics and machine learning.

association rule learning

novelty detection

Novelty detection is the process of identifying new or unknown data points that differ significantly from the majority of the data. It is often used in fraud detection, network security, and quality control.

Sample from the book:

“A data set of dog pictures contains mostly big dogs and a few small dogs. When a picture of a Chihuahua arrives, the algorithm identifies it as a novelty because it differs significantly from the majority of the data (big dogs).”
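To make the dog example a bit more concrete, here is a minimal sketch of novelty detection. It is only an illustration under my own assumptions: each dog is reduced to a single weight in kg, the weights are made up, and "different" means a z-score above a threshold of 3. Real novelty detectors work on much richer features.

```python
from statistics import mean, stdev


def fit_novelty_detector(training_weights):
    """Learn what 'normal' looks like from the training data (mean and spread)."""
    return mean(training_weights), stdev(training_weights)


def is_novelty(weight, model, threshold=3.0):
    """Flag a new instance whose z-score against the training data exceeds the threshold."""
    mu, sigma = model
    return abs(weight - mu) / sigma > threshold


# Hypothetical weights (kg) of the big dogs seen during training.
big_dog_weights = [30.0, 32.5, 28.0, 35.0, 31.0, 29.5, 33.0, 34.0]
model = fit_novelty_detector(big_dog_weights)

chihuahua_flagged = is_novelty(2.5, model)   # a Chihuahua-sized dog
big_dog_flagged = is_novelty(31.5, model)    # one more big dog

print("Chihuahua is a novelty:", chihuahua_flagged)
print("Big dog is a novelty:", big_dog_flagged)
```

The threshold is a design choice: a lower value flags more instances as novelties, a higher value flags fewer.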

anomaly detection

sampling noise

sampling bias

Examples of sampling bias

Perhaps the most famous example of sampling bias happened during the US presidential election in 1936.
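The effect of sampling bias is easy to reproduce with a toy calculation. The sketch below uses an invented population of 100 voters (not real election data) where the poll can only reach some of them, and the reachable group happens to prefer a different candidate than the population as a whole.

```python
# Hypothetical population: each voter has a preferred candidate and a flag
# saying whether the pollster's contact list can reach them.
population = (
    [{"votes_for": "A", "reachable": True}] * 30
    + [{"votes_for": "B", "reachable": True}] * 10
    + [{"votes_for": "A", "reachable": False}] * 10
    + [{"votes_for": "B", "reachable": False}] * 50
)


def share_for(candidate, voters):
    """Fraction of the given voters who prefer the candidate."""
    return sum(v["votes_for"] == candidate for v in voters) / len(voters)


# The poll only hears from reachable voters, so the sample is not representative.
biased_sample = [v for v in population if v["reachable"]]

true_share = share_for("A", population)       # 40 of 100 voters
polled_share = share_for("A", biased_sample)  # 30 of 40 reachable voters

print(f"True share for A: {true_share:.2f}")
print(f"Polled share for A: {polled_share:.2f}")
```

The poll predicts a clear win for candidate A, while the full population actually prefers candidate B. Collecting more answers from the same contact list would not fix this, because the problem is who gets sampled, not how many.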

clustering

classification

regression

Feature engineering involves the following steps:

• Feature selection (selecting the most useful features to train on among existing features)
• Feature extraction (combining existing features to produce a more useful one; dimensionality reduction algorithms can help)
• Creating new features by gathering new data
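The first two ideas can be sketched in a few lines. This is only an illustration with made-up housing numbers and hypothetical feature names: extraction combines two columns into a ratio, and selection ranks features by the absolute Pearson correlation with the target. Correlation is just one possible ranking criterion, not the only way to select features.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_features(features, target, k):
    """Feature selection: keep the k features most correlated with the target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical data for five districts.
features = {
    "total_rooms": [100, 200, 300, 400, 500],
    "households": [50, 50, 100, 100, 100],
    "street_number": [5, 1, 9, 3, 7],  # should carry little information
}
price = [20, 40, 30, 40, 50]

# Feature extraction: combine two existing features into a more useful one.
features["rooms_per_household"] = [
    rooms / homes
    for rooms, homes in zip(features["total_rooms"], features["households"])
]

best = select_features(features, price, k=2)
print(best)
```

In this toy data the extracted ratio ends up ranked above both of the raw columns it was built from, which is the point of feature extraction.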

Say you are visiting a foreign country and the taxi driver rips you off. You might be tempted to say that all taxi drivers in that country are thieves. Overgeneralizing is something that we humans do all too often, and unfortunately machines can fall into the same trap if we are not careful. In machine learning this is called overfitting: the model performs well on the training data, but it does not generalize well.
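Here is a small sketch of what overfitting looks like in numbers. The data is invented: the true relation is assumed to be y = 2x, and the training labels carry noise of plus or minus 1. A straight line is compared with a polynomial that passes through every training point exactly (Lagrange interpolation, chosen here only because it is easy to write by hand).

```python
def fit_line(points):
    """Ordinary least squares for y = a*x + b (a simple, constrained model)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = sum((x - mx) * (y - my) for x, y in points) / sum(
        (x - mx) ** 2 for x, _ in points
    )
    b = my - a * mx
    return lambda x: a * x + b


def fit_interpolating_polynomial(points):
    """Lagrange polynomial that reproduces every training point, noise included."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, points):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


# Hypothetical data: noisy training labels, noise-free test labels on y = 2x.
train = [(0.0, 1.0), (1.0, 1.0), (2.0, 5.0), (3.0, 5.0), (4.0, 9.0), (5.0, 9.0)]
test = [(0.5, 1.0), (2.5, 5.0), (5.5, 11.0), (6.0, 12.0)]

line = fit_line(train)
poly = fit_interpolating_polynomial(train)

print("line: train MSE", mse(line, train), "test MSE", mse(line, test))
print("poly: train MSE", mse(poly, train), "test MSE", mse(poly, test))
```

The polynomial has zero error on the training set because it memorizes the noise, yet it does far worse than the line on the unseen points. That gap between training and test error is the usual symptom of overfitting.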

Transferring knowledge from one task to another is called transfer learning, and it’s one of the most important techniques in machine learning today, especially when using deep neural networks (i.e., neural networks composed of many layers of neurons). The book discusses this in detail in Part II.

Some photo-hosting services, such as Google Photos, are good examples of semi-supervised learning. Once you upload all your family photos to the service, it automatically recognizes that the same person A shows up in photos 1, 5, and 11, while another person B shows up in photos 2, 5, and 7. This is the unsupervised part of the algorithm (clustering). Now all the system needs is for you to tell it who these people are. Just add one label per person and it is able to name everyone in every photo, which is useful for searching photos.
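The two steps described above (cluster first, then label once per cluster) can be sketched as follows. Everything here is hypothetical: the 2-D "face embeddings", the photo ids, the names, and the simple greedy clustering by distance. It is not how any real photo service is implemented.

```python
from math import dist


def cluster_faces(embeddings, max_distance):
    """Unsupervised step: a face joins the first cluster whose first member is close enough."""
    clusters = []  # each cluster is a list of face ids
    for face_id, vector in embeddings.items():
        for members in clusters:
            if dist(vector, embeddings[members[0]]) <= max_distance:
                members.append(face_id)
                break
        else:
            clusters.append([face_id])
    return clusters


def propagate_labels(clusters, labels):
    """Supervised step: copy the one user-supplied label of a cluster to all its faces."""
    named = {}
    for members in clusters:
        name = next((labels[f] for f in members if f in labels), None)
        for face_id in members:
            named[face_id] = name
    return named


# Hypothetical embeddings; photo 5 contains two faces, so it appears twice.
faces = {
    "photo1": (0.10, 0.20),
    "photo2": (0.90, 0.80),
    "photo5_left": (0.12, 0.18),
    "photo5_right": (0.88, 0.83),
    "photo7": (0.91, 0.79),
    "photo11": (0.09, 0.22),
}

clusters = cluster_faces(faces, max_distance=0.1)

# The user labels a single face per person.
names = propagate_labels(clusters, {"photo1": "Alice", "photo2": "Bob"})
print(names)
```

Only two labels were supplied, but every face ends up with a name. A cluster that never receives a label keeps the value None, which mirrors a person the user has not identified yet.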
