Angoss Blog

Angoss Analytics Delivers Intuitive Big Data Mining Solutions

Variable Types: From the Modeller Point of View

Published September 15, 2017.

Storage vs. Analysis: Computer scientists and software developers who designed databases and data management systems did so with the objective of minimizing disk space and making access to the data easy and fast. For example, Oracle database currently allows 26 field types to select from when defining a table. On the other hand, from the […]

Information Value – A Numerical Example

Published August 10, 2017.

Information Value is a widely used statistic in scorecard development and in data mining in general. I hope you find the numerical example below on Information Value calculation useful. Information Value is a measure that helps us understand how well an Independent Variable (IV) is able to separate the categories of […]

What does a good decision tree model look like?

Published July 12, 2017.

As predictive models, decision trees are mainly used for two purposes: classification and regression. In classification tasks, the purpose is to label each observation with one of a limited number of categories. For example, we may want to classify credit card applications into two classes: high risk and low risk. In regression […]

What makes a good model?

Published June 14, 2017.

Four main characteristics can be used to judge how good a model is. These are: 1. Accuracy. Accuracy measures how well the model predicts the outcome. When the predictions are close to the actual values (on some validation dataset), the model is deemed to be accurate. Some measures […]

Model Accuracy: Basic Concepts

Published April 26, 2017.

One of the characteristics a good model should have is accuracy. In this article, we will discuss what is meant by accuracy and how we define an accurate prediction. Let’s first review what we mean when we want our models to be accurate. Accurate models are defined as […]

Crime & Commute in Toronto

Published April 5, 2017.

I was talking to a co-worker and she mentioned that there were parts of Toronto she avoided, and that sparked something. Wouldn’t it be nice to know if our commute crosses the more dangerous parts of Toronto? That way I don’t unknowingly get off at a stop in a neighbourhood that has the highest number […]

Reject Inference for Application Scorecards

Published March 29, 2017.

Financial institutions rely on credit scoring models to assess the risks associated with granting credit. In particular, application scorecards are commonly used as decision support mechanisms for customer acquisition and are developed based on approved applicants. Declined applicants are not included in the modeling exercise, which makes sense because their performance is not known. However, […]

Data Prep, no shortcuts to good Modelling

Published March 22, 2017.

Data Preparation is the backbone of any analysis, and many varied data preparation procedures are available to access and shape data into an appropriate representation for modelling or reporting. Data Preparation is, as those who are involved will attest, a time-consuming task. An array of figures has been quoted to reflect the proportion of […]

Optimization: Moving from Insight to Actionable Foresight

Published March 14, 2017.

You’ve probably seen this chart or one like it recently: Most organizations are finding their analytics efforts are somewhere between descriptive and predictive. Few have been able to effectively move from only predictive to prescriptive, and many rely on rules of thumb or gut feel to apply analytic learnings. Many of those who’ve effectively moved into […]