Saturday, 2 September 2017

Let’s understand Big Data

Of late, I have been working in the field of data science. I thought I would share with you all what I have been learning. I will start with the term Big Data.

What's Big Data?

During the last few years, you have been hearing everyone say “Big Data”. It has become a buzzword in most discussions. You might be wondering what this Big Data is about.
  • Is it huge data?
  • Is it large amounts of data?
  • Is it meant for big people?
  • What is it all about?
  • And, more importantly, how do you identify big data, and what is the use of it?

Today, I want to put the concept in clear and simple terms so that you can understand what it is and identify for yourself what counts as big data in any context.

Let us define Big Data formally first –
Big Data is massive unstructured data that has variety, velocity, volume, and veracity, which can be used in decision making.

Massive Unstructured Data

Of late, more and more data is becoming available from various sources, in different formats.

Variety

When we say variety, we mean the different types of data sources that are becoming available – text, audio, video, click streams, log files and more.


Velocity

Data is frequently time-sensitive and must be used as it streams into the enterprise to maximize its value.

Volume
Terabytes and even petabytes of information are becoming available. This calls for scalable storage and a distributed approach to querying.

Veracity
With information flowing in from so many different sources, at such speed, and in huge quantities that need to be used immediately, the important questions are: How reliable is the data? How accurate is it?
Data can be ambiguous, inconsistent, or outright spam, and such data needs to be filtered out before major decisions are taken.
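
To make the filtering idea concrete, here is a minimal sketch in Python. The record layout (`id`, `amount`) and the three rules are my own assumptions for illustration; real veracity checks depend entirely on your data.

```python
from typing import Iterable


def filter_reliable(records: Iterable[dict]) -> list[dict]:
    """Keep only records that pass some basic, made-up veracity checks."""
    seen_ids = set()
    reliable = []
    for rec in records:
        # Ambiguous: a required field is missing or empty.
        if not rec.get("id") or rec.get("amount") is None:
            continue
        # Inconsistent: a negative amount is invalid in this hypothetical schema.
        if rec["amount"] < 0:
            continue
        # Duplicate (possibly spam): this id has already been accepted.
        if rec["id"] in seen_ids:
            continue
        seen_ids.add(rec["id"])
        reliable.append(rec)
    return reliable


raw = [
    {"id": "a1", "amount": 120.0},
    {"id": "a1", "amount": 120.0},  # duplicate
    {"id": "a2", "amount": -5.0},   # inconsistent
    {"id": "", "amount": 40.0},     # missing id
    {"id": "a3", "amount": 75.5},
]
reliable = filter_reliable(raw)
print([rec["id"] for rec in reliable])
```

Only the first `a1` record and the `a3` record survive the checks; everything else is dropped before it can influence a decision.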


Decision Making based on Data

The importance of data in decision-making is recognized worldwide. And data is flowing in large volumes, at high velocity, from sources of many different types, bringing uncertainty along with it.

Data Scientists – The most sought-after people in the corporate world

The task is to work out
  • what data is relevant,
  • what sources are available,
  • how to gather the data as it flows in,
  • how to clean it and shape it,
  • how to model it to draw data insights,
  • how to report the findings.

Each of these is a big topic on its own.
Now, I think you can appreciate the need for data scientists in corporations and how interesting and promising the career opportunity is.


Friday, 28 July 2017

Learn to Write DAX Online Training by Matt Allington

Learn to Write DAX Online is a combination of self-paced remote learning using the book “Learn to Write DAX” by Matt Allington, weekly video-based training with examples and demos to support the complex topics in the book, and weekly live screen-sharing Q&A sessions with Matt Allington.


This training course has been specifically designed to provide the following benefits.
  • The training is run in semesters so that you can be part of a class of people that are all learning together.
  • You get to learn incrementally over a 5-week period in your own time. Just find a couple of hours each week when you can focus.
  • There is a weekly video where Matt Allington explains the concepts that are harder to learn yourself.
  • You read the chapters assigned for each week, do the exercises in those chapters and then watch the video to summarise the week.
  • You can watch the weekly videos at any time in the week that suits you.
  • You get direct access to Matt Allington during the weekly Q&A session, where you can ask for clarification on any questions you have.
  • You can purchase the course and start at any time, and you will get access to the lessons and weekly videos immediately. However, you might have to wait until the next semester for the Q&A sessions.
  • You can choose the appropriate session times for the weekly Q&A sessions that suit your time zone.
  • You will have access to the online course material (videos) for 12 months, so if you want to come back and watch them again then that is fine.  Also, you will receive the Q&A session recording through email and you can keep a copy for future reference.
  • The cost is very affordable compared with a comparable live training course (about 80% cheaper).
If you are interested in enrolling for the course, please leave your contact details in the comments.

Tuesday, 28 March 2017

Financial Risks


Hi Friends,
Let’s recap the definition of Risk that I had given in my Blog Post – Risk Management.
“A Risk is any event that, if it happens, can have a significant influence on execution”.
  • If it happens – refers to the probability of happening, meaning the event may or may not happen, but there is a likelihood of it happening.
  • Influence – refers to the impact on the execution.
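
The two parts of this definition, likelihood and impact, are often combined into a single number so that risks can be compared. The sketch below multiplies the two; that formula is a common convention I am assuming here, not something the definition itself prescribes, and the risks and figures are invented for illustration.

```python
def risk_exposure(probability: float, impact: float) -> float:
    """Likelihood (0 to 1) multiplied by impact (for example, a cost)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * impact


# Hypothetical risks: (name, probability of happening, impact in currency units)
risks = [
    ("Staff turnover", 0.50, 10_000),
    ("Key supplier delay", 0.30, 50_000),
    ("Server outage", 0.05, 200_000),
]

# Rank the risks so the largest exposure comes first.
ranked = sorted(risks, key=lambda r: risk_exposure(r[1], r[2]), reverse=True)
for name, probability, impact in ranked:
    print(f"{name}: exposure = {risk_exposure(probability, impact):,.0f}")
```

Note how the least likely event is not automatically the least important one: a rare event with a large impact can outrank a frequent one with a small impact.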
[Image: Risk – an example]

In this post, I will discuss Financial Risks.

What is a Financial Risk?

A financial risk is any risk associated with financing, with the potential for financial loss and uncertainty about its extent.

The term financial risk by itself is broad but can be understood better if we consider the different types of financial risks.

Types of Financial Risks

Financial risks can be categorized as follows –

Asset-backed Risk: Changes in one or more of the assets that support an asset-backed security will significantly affect the value of that security. Home loans are an example. To finance home sales, banks issue bonds that serve as a debt obligation to the bond's buyer. The bond buyer essentially receives, through the bank, the interest that the homebuyer pays on the mortgage.
Credit Risk: A credit risk is the risk of default on a debt that may arise from a borrower failing to make required payments.
Liquidity Risk: The risk that a given security or asset cannot be traded quickly enough in the market to prevent a loss or make the required profit. 
Market Risk: Market risk is the risk of losses in positions arising from movements in market prices.
Operational Risk: The Basel Committee on Banking Supervision, in the Basel II framework, defines operational risk as: "The risk of loss resulting from inadequate or failed internal processes, people, and systems or from external events."

Financial Risks – Sub-categories

The risk categories, in turn, have several sub-categories, as described below.



Asset-backed risks include interest rate, term modification, and prepayment risk.
*Prepayment risk is the risk that the homebuyer pays off the mortgage early. The buyer of the bond then loses the homebuyer's future interest payments.
*Interest rate risk refers to an asset whose terms can change over time, such as a variable-rate mortgage payment.

A Credit risk can be of the following types:
*Credit default risk – The risk of loss arising when a debtor is unlikely to pay the loan obligations in full, or when the debtor is more than 90 days past due on any material credit obligation. Default risk may impact all credit-sensitive transactions, including loans, securities, and derivatives.
*Concentration risk – The risk associated with any single exposure or group of exposures with the potential to produce large enough losses to threaten a bank's core operations. It may arise in the form of single name concentration or industry concentration.
*Country risk – The risk of loss arising from a sovereign state freezing foreign currency payments (transfer/conversion risk) or when it defaults on its obligations (sovereign risk). This type of risk is prominently associated with the country's macroeconomic performance and its political stability.
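
The credit default rule above (unlikely to pay in full, or more than 90 days past due) is simple enough to write as a check. This is only a sketch of that one sentence; the function and parameter names are my own, and how a bank decides that a debtor is "unlikely to pay" is outside its scope.

```python
def is_credit_default(days_past_due: int, unlikely_to_pay: bool = False) -> bool:
    """True when the debtor is judged unlikely to pay in full,
    or is more than 90 days past due on a material obligation."""
    return unlikely_to_pay or days_past_due > 90


print(is_credit_default(days_past_due=91))                        # past the 90-day mark
print(is_credit_default(days_past_due=90))                        # exactly 90 days is not "more than 90"
print(is_credit_default(days_past_due=10, unlikely_to_pay=True))  # flagged on judgement alone
```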

Liquidity risk can be of the following types:
*Asset liquidity - An asset cannot be sold due to lack of liquidity in the market - essentially a sub-set of market risk. 
*Funding liquidity - The risk that liabilities:
- Cannot be met when they fall due
- Can only be met at an uneconomic price
- Can be name-specific or systemic

The most commonly cited types of market risk are:
*Equity risk - the risk that stock or stock index prices or their implied volatility will change.
*Interest rate risk - the risk that interest rates or their implied volatility will change.
*Currency risk - the risk that foreign exchange rates or their implied volatility will change.
*Commodity risk - the risk that commodity prices or their implied volatility will change.

Official Basel II types of Operational risks are the following:
*Internal Fraud – misappropriation of assets, tax evasion, intentional mismarking of positions, bribery.
*External Fraud – theft of information, hacking damage, third-party theft, and forgery.
*Employment Practices and Workplace Safety – discrimination, workers’ compensation, employee health and safety.
*Clients, Products, and Business Practice – market manipulation, antitrust, improper trade, product defects, fiduciary breaches, account churning.
*Damage to Physical Assets – natural disasters, terrorism, vandalism.
*Business Disruption and Systems Failures – utility disruptions, software failures, hardware failures.
*Execution, Delivery, and Process Management – data entry errors, accounting errors, failed mandatory reporting, negligent loss of client assets.


Wednesday, 22 March 2017

Microsoft Excel – Spreadsheet to Business Intelligence


Hi Friends, we have been using Microsoft Office for different purposes, both at home and at the office. Though Excel made its entry as a substitute for legacy spreadsheet applications such as Lotus 1-2-3, it has come a long way, with versatility and user-friendliness that enable higher-level data analysis.

At its core, Excel is widely used for data entry. For everyone from novice users to Excel Pros, a large toolset is available in the current versions of Excel.

Excel – Primary Data Tools

Let us start by examining the basic tools that Excel provides for data processing –
  • Data Ranges
  • Data Tables
  • Charts

You can enter data in rows and columns in a worksheet in an Excel workbook. You can either work on that data, considering it as a data range or convert it to a table for more sophisticated operations.
You can visualize the data patterns using Charts, and Excel has a Recommended Charts option that suggests appropriate chart types based on your data.

Next Level of Tools

When you have large data sets, you might want to aggregate and summarize the data. For this purpose, Excel introduced -
  • PivotTables
  • PivotCharts

PivotTables help you examine the different facets of the data and generate specific reports. Further, Excel allows you to change PivotTables dynamically, so you can present the results and adjust them on the fly to answer the questions that are raised.
A PivotChart is a cousin of the PivotTable and portrays the data in chart form instead of a table.
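
If you are curious about what a PivotTable does under the hood, the idea can be sketched outside Excel in a few lines of Python. This is not how Excel implements it; it is just the same "group by row field and column field, then sum" idea, with made-up sales data and a hypothetical `pivot_sum` helper.

```python
from collections import defaultdict

# Made-up sales rows for illustration.
sales = [
    {"region": "East", "product": "Pen", "amount": 100},
    {"region": "East", "product": "Pen", "amount": 50},
    {"region": "East", "product": "Ink", "amount": 30},
    {"region": "West", "product": "Pen", "amount": 70},
    {"region": "West", "product": "Ink", "amount": 20},
    {"region": "West", "product": "Ink", "amount": 40},
]


def pivot_sum(rows, row_field, col_field, value_field):
    """Sum value_field for every (row_field, col_field) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_field]][row[col_field]] += row[value_field]
    # Convert the nested defaultdicts to plain dicts for easy printing.
    return {key: dict(cols) for key, cols in table.items()}


pivot = pivot_sum(sales, "region", "product", "amount")
for region in sorted(pivot):
    print(region, pivot[region])
```

Swapping `"region"` and `"product"` in the call is the equivalent of dragging fields between the Rows and Columns areas of a PivotTable.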

VLOOKUP and HLOOKUP

Excel has 500+ built-in functions that enable you to perform the required operations on raw data to produce the desired results. Of these, the lookup functions are widely used. If your data is in two tables and you need to combine data from both, the VLOOKUP function comes in handy. For example, if you have a Products table and a Sales table, you can obtain the sales data for a particular product with VLOOKUP.
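
For readers who think in code, an exact-match lookup between two tables can be sketched in Python as below. The tables, the `vlookup` helper, and the `"#N/A"` fallback are illustrative assumptions that only mimic the behaviour of an exact-match VLOOKUP; they are not Excel's implementation.

```python
# Hypothetical Products table: (product id, product name)
products = [("P1", "Pen"), ("P2", "Notebook")]

# Hypothetical Sales table: (product id, quantity sold)
sales = [("P1", 10), ("P2", 4), ("P9", 1)]

# Index the Products table by its key column, like the first column of a lookup range.
name_by_id = {product_id: name for product_id, name in products}


def vlookup(key, table, default="#N/A"):
    """Exact-match lookup; returns a default marker when the key is missing."""
    return table.get(key, default)


# Combine the two tables: add the product name to every sales row.
enriched = [(pid, vlookup(pid, name_by_id), qty) for pid, qty in sales]
for row in enriched:
    print(row)
```

The sale for `P9` has no matching product, so it gets the `"#N/A"` marker, much as a lookup in a worksheet would show an error for a missing key.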

Though VLOOKUP was the most sought-after Excel function, it has its drawbacks, which have been addressed by other tools introduced in Excel.

Data Analysis Tools in Excel

This is where Excel began to stand out. It crossed its spreadsheet boundaries with a range of data analysis tools and add-ins. You will find the following data analysis tools handy if your data sets are not too large.
  • What-if Analysis
  • Forecasting (predicting data trends)
  • Analysis ToolPak (for statistical analysis)
  • Solver (for optimization and equation solving)


Big Data Evolution

Excel Pros were quite satisfied with the available data analysis tools as long as they were working with single tables of data spanning a few hundred rows.
Then Big Data became a buzzword in the industry, and anything and everything related to data started being called Big Data.

Let us pause here to understand the background of this commotion. Data sources have become vast, ranging from a simple text message to databases containing millions of rows. Data is available through the web. Top managers have realized that if their decision making is based on the data available from these various sources, not only will the decisions be more fruitful, but they will also have a view of data trends and can take immediate action when necessary. This has raised management's expectations of data analysts.

Data Science is now among the most highly paid domains, and several tools catering to the needs of data analysts have come into existence. But if we observe closely, most of these tools are geared more toward reporting, with various visualizations and dashboards.

Yes, reporting is one of the tasks of a data analyst. But before reporting, the data has to be analyzed and the key insights brought out, so that the decision makers can focus on the appropriate and relevant information. Further, as the data is obtained from several sources, the raw data might require cleaning and shaping before it is subjected to analysis.

So, the order is – Data Cleaning, Shaping, Analysis, and Reporting. This is where Microsoft paved its way through Excel.
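
The four steps can be pictured as a small chain of functions, each feeding the next. The Python sketch below is only meant to show the order; the data, the function names, and what each step does are my own simplified assumptions and have nothing to do with how Excel or the Power Tools work internally.

```python
from collections import defaultdict

# Made-up raw data, as it might arrive from different sources.
raw = [
    {"city": " Pune ", "sales": "120"},
    {"city": "Delhi", "sales": ""},      # missing value
    {"city": "pune", "sales": "80"},
    {"city": "Delhi", "sales": "200"},
]


def clean(rows):
    """Cleaning: trim whitespace and drop rows with missing values."""
    trimmed = [{key: value.strip() for key, value in row.items()} for row in rows]
    return [row for row in trimmed if all(row.values())]


def shape(rows):
    """Shaping: normalise city names and convert sales to numbers."""
    return [{"city": row["city"].title(), "sales": int(row["sales"])} for row in rows]


def analyze(rows):
    """Analysis: total sales per city."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["city"]] += row["sales"]
    return dict(totals)


def report(summary):
    """Reporting: one readable line per city, in alphabetical order."""
    return [f"{city}: {total}" for city, total in sorted(summary.items())]


summary = analyze(shape(clean(raw)))
report_lines = report(summary)
print("\n".join(report_lines))
```

If the cleaning step were skipped, the conversion to numbers in the shaping step would fail on the empty sales value, which is exactly why the order matters.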

Excel Power Tools

Microsoft introduced the Power Tools to handle the data analysis for decision making.
  • Power Pivot
  • Power Query
  • Power View
  • Power Map

Microsoft also came up with a standalone tool for Business Intelligence – Power BI.

I will cover more details on the Power Tools in the next post.