Feature Engineering
The process of selecting, transforming, and creating input variables from raw data to help machine learning models learn more effectively.
Features are the input variables a model uses to make predictions, and engineering them well is often the single most impactful step in building an effective model.
What is a feature?
A feature is any measurable property of the data that the model uses as input. In a customer churn prediction model, features might include: account age, monthly spend, number of support calls, days since last login, and contract type.
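The churn features above can be sketched in plain Python; the record fields, dates, and feature names here are illustrative, not a prescribed schema:

```python
from datetime import date

# Hypothetical raw customer record.
customer = {
    "signup_date": date(2022, 3, 1),
    "last_login": date(2024, 1, 10),
    "monthly_spend": 49.99,
    "support_calls": 3,
    "contract_type": "annual",
}

today = date(2024, 2, 1)

# Each feature is a measurable property derived from the raw record.
features = {
    "account_age_days": (today - customer["signup_date"]).days,
    "days_since_last_login": (today - customer["last_login"]).days,
    "monthly_spend": customer["monthly_spend"],
    "support_calls": customer["support_calls"],
    "is_annual_contract": int(customer["contract_type"] == "annual"),
}
```

Note that "account age" and "days since last login" do not exist in the raw record at all; they only appear once someone decides they are worth computing.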
Types of feature engineering
- Feature creation – deriving new features from existing data. From a timestamp, you might create day-of-week, hour-of-day, and is-weekend features. From an address, you might calculate distance-to-nearest-store.
- Feature transformation – changing the scale or distribution of features. Log transformations, normalisation, and binning are common examples.
- Feature selection – choosing which features to include. Irrelevant features add noise and slow training. Correlated features provide redundant information.
- Feature encoding – converting non-numerical data (categories, text) into numerical representations the model can process.
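The first, second, and fourth types can be shown in a few lines of standard-library Python; the event record and its values are hypothetical:

```python
import math
from datetime import datetime

# Hypothetical raw transaction event.
event = {"timestamp": datetime(2024, 6, 8, 14, 30), "amount": 1500.0, "channel": "web"}

# Feature creation: derive calendar features from the timestamp.
dow = event["timestamp"].weekday()  # 0 = Monday ... 6 = Sunday
created = {
    "day_of_week": dow,
    "hour_of_day": event["timestamp"].hour,
    "is_weekend": int(dow >= 5),
}

# Feature transformation: compress a skewed amount with a log transform.
transformed = {"log_amount": math.log1p(event["amount"])}

# Feature encoding: one-hot encode the categorical channel.
channels = ["web", "store", "phone"]
encoded = {f"channel_{c}": int(event["channel"] == c) for c in channels}

feature_vector = {**created, **transformed, **encoded}
```

Feature selection would then act on `feature_vector`, for example by dropping near-constant or highly correlated columns before training.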
Why feature engineering matters
The features you give a model define what it can learn. No algorithm can discover a pattern that is not represented in its input features. A model predicting house prices cannot learn that proximity to good schools matters if it has no feature representing school quality.
Domain knowledge is key
The best features come from understanding the problem domain. A data scientist who understands retail knows that "days since last purchase" is more predictive than "total number of purchases." A finance expert knows that "debt-to-income ratio" matters more than raw income.
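Both domain-informed features mentioned above are one-line computations once you know to look for them; the records and values below are hypothetical:

```python
from datetime import date

# Hypothetical records.
applicant = {"monthly_income": 6000.0, "monthly_debt": 2100.0}
shopper = {"purchase_dates": [date(2024, 1, 5), date(2024, 3, 20)]}

# Finance: the ratio carries the signal, not raw income.
debt_to_income = applicant["monthly_debt"] / applicant["monthly_income"]

# Retail: recency is often more predictive than volume.
today = date(2024, 4, 1)
days_since_last_purchase = (today - max(shopper["purchase_dates"])).days
```

The code is trivial; the value is in knowing which ratio or recency measure matters for the problem at hand.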
Feature engineering vs. deep learning
Deep learning reduces (but does not eliminate) the need for manual feature engineering. Neural networks can learn useful representations from raw data – this is partly why they work so well with images and text. But for structured business data, manual feature engineering combined with gradient-boosted trees often outperforms deep learning.
Automated feature engineering
Tools like Featuretools and TSFresh can automatically generate candidate features. But human judgement is still needed to evaluate which generated features are meaningful versus noisy.
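This is not how Featuretools or TSFresh are invoked, but the core idea of automated candidate generation can be sketched in standard-library Python: mechanically combine base features, then leave judgement of the results to a human or a selection step. The row values are hypothetical:

```python
from itertools import combinations

# A toy row of numeric base features (hypothetical values).
row = {"revenue": 1200.0, "cost": 800.0, "headcount": 10.0}

# Automatically generate candidate features: all pairwise differences and ratios.
candidates = {}
for a, b in combinations(row, 2):
    candidates[f"{a}_minus_{b}"] = row[a] - row[b]
    if row[b] != 0:
        candidates[f"{a}_over_{b}"] = row[a] / row[b]

# A person still has to judge the output: revenue_over_cost is a margin-like
# signal, while revenue_minus_headcount mixes units and is almost certainly noise.
```

Real tools generate far richer candidates (aggregations across related tables, windowed time-series statistics), but the evaluation problem is the same.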
Why this matters
Feature engineering is where domain expertise meets data science. Business professionals who understand their data are invaluable partners in this process. Knowing what feature engineering is helps you contribute meaningfully to AI projects – your knowledge of customer behaviour, market dynamics, or operational patterns is exactly what makes features predictive.