data science

Pandas: Quantile

Usage of the Pandas quantile function to analyze Fortune500 data. Data visualizations using Seaborn

Statistical Learning

If we assume that there is a relationship between two or more independent variables and one dependent variable, we can develop a model to predict the value of the dependent variable, given a set of values of the independent variables. For example, if we have a dataset that contains the following data: Sales Price Advertising expenditure We can assume that Sales is the dependent variable, while Price and Advertising Expenditure are the independent variables because the price and the advertising expenditure are directly controlled by the supplier of a product or service.

What is Google BigQuery

Google BigQuery is a serverless scalable data warehouse that is available on Google Cloud. The greatest advantage of Google BigQuery is that it allows companies to store and analyze big data without having to setup and manage infrastructure. Companies can analyze and transform data using famliar ANSI-standard SQL. Google BigQuery can be accessed using an REST API, the Google BigQuery web interface (browser) and a command line tool. By decoupling storage and compute, different layers of the architecture can be executed and scale independently.