Data Science

Empower your data teams with practical data science and machine learning capabilities, fusing computer science, statistics, and business. Gain skills such as Python for exploratory data analysis, constructing and refining machine learning models to forecast patterns, and effectively communicating data-driven insights to various stakeholders.


Data Science Bootcamp

Overview:

Live, Instructor-Led Classes
Onsite or Virtual Classes
Bootcamp (Part-Time): 40 Hours | 10 Weeks
HRD Corp Signature Programme (100% Claimable)

Prerequisites:

This is a fast-paced course with some prerequisites. Learners should be comfortable with programming fundamentals, core Python syntax, and basic statistics. 

Ideal for:

  • Analysts or engineers who want to break into data science.
  • Managers who work with technical teams and want to communicate and empathise with them more effectively.

Outcomes:

  • Level 1 Data Skills: Wrangle, explore, model, and communicate the results of multiple analyses with Python and its many packages.
  • Level 2 Data Skills: Work on advanced analytics data sets and explore the capabilities of machine learning.

Curriculum Outline:

  • Pre-Work Data Science Fundamentals
  • Fundamentals
  • Working with Data
  • Data Science Modelling
  • Data Science Applications/Machine Learning

Course Outline

Explore the essentials of Python for data science and applied math through a series of self-paced online preparatory lessons.

• Define basic Python programming concepts and data types, including variables, lists, dictionaries, loops, and functions.
• Create functions that accept multiple arguments and return multiple values.
• Understand the purpose of iterators in real-world data science workflows.
• Describe the use and purpose of DataFrames and how they can be used to manipulate data with Pandas.
• Plot visualizations with Matplotlib and Seaborn.
• Get acquainted with descriptive and inferential statistics and how to calculate them.
• Calculate combinations and permutations.
• Familiarize yourself with developer tools for data science, including GitHub basics and working with the command line.
• Calculate linear algebra and regression equations.
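As a small illustration of the combinations and permutations item in the list above, here is a minimal sketch using Python's standard `math` module. The team-selection scenario and the variable names are assumptions made up for this example and are not taken from the course material.

```python
import math

# Hypothetical scenario: choosing 3 people from a team of 10.
team_size = 10
picks = 3

# Combinations: order does not matter ("10 choose 3").
n_combinations = math.comb(team_size, picks)

# Permutations: order matters (e.g. assigning 1st, 2nd, and 3rd place).
n_permutations = math.perm(team_size, picks)

print(n_combinations)  # 120
print(n_permutations)  # 720
```

Each unordered group of 3 can be arranged in 3! = 6 ways, which is why the permutation count is six times the combination count.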

Discover the fundamentals of data science by executing basic functions in Python.

What Is Data Science?

  • Define the workflow, tools, and approaches data scientists use to analyze data.
  • Apply the Data Science Workflow to solve a task.

Your Development Environment

  • Navigate through directories using the command line.
  • Use Git and GitHub to share repositories.

Python Foundations

  • Conduct arithmetic and string operations in Python.
  • Assign variables.
  • Implement loops and conditional statements.
  • Use Python to clean and edit data sets.

Project: Complete coding challenges that often appear in data science job interviews, further developing your Python programming skills.

Practice exploratory data analysis to clean and aggregate data, and learn to interpret basic statistical test results for your data.

Exploratory Data Analysis in Pandas

  • Use DataFrames and Series to read data.
  • Rename, remove, combine, select, and JOIN data.
  • Identify and handle null and missing values.

Data Visualization in Python

  • Define key principles of data visualization.
  • Create line plots, bar plots, histograms, and box plots using Seaborn and Matplotlib.

Statistics in Python

  • Use NumPy and Pandas libraries to analyze data sets using basic summary statistics.
  • Create data visualizations to discern characteristics and trends in a data set.
  • Identify a normal distribution within a data set using summary statistics and visualization.

Experiments and Hypothesis Testing

  • Determine causality and sampling bias.
  • Test a hypothesis using a sample case study.
  • Validate your findings using statistical analysis (e.g., p values, confidence intervals).

Project: Apply your growing Python and analytical skills to conduct a basic exploratory data analysis and answer questions about a real-world data set.

Branch from traditional statistics into machine learning and explore supervised learning techniques including classification and regression.

Linear Regression

  • Define data modeling and linear regression.
  • Differentiate between categorical and continuous variables.
  • Build a linear regression model for prediction using the scikit-learn library.

Train/Test Split

  • Describe errors of bias and variance.
  • Define overfitting and underfitting.
  • Explore k-folds, LOOCV, and three-split methods.

KNN and Classification

  • Build a k-nearest neighbors model using the scikit-learn library.
  • Evaluate and tune the model using metrics such as classification accuracy/error.

Logistic Regression

  • Build a logistic regression classification model using the scikit-learn library.
  • Describe the sigmoid function, odds, and odds ratios and how they relate to logistic regression.
  • Evaluate a model using metrics such as classification accuracy/error, confusion matrix, ROC curves and AUC, and loss functions.
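To make the link between the sigmoid function, odds, and odds ratios concrete, here is a minimal sketch in plain Python. The intercept and coefficient are hypothetical values chosen for illustration; in the course, a fitted scikit-learn model would supply them.

```python
import math

def sigmoid(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability into odds (p against 1 - p)."""
    return p / (1.0 - p)

# Hypothetical logistic regression parameters (assumed, not fitted).
intercept = -1.0
coef = 0.5

def predict_proba(x):
    """Predicted probability for a single feature value x."""
    return sigmoid(intercept + coef * x)

p1 = predict_proba(2.0)  # log-odds = 0.0, so the probability is 0.5
p2 = predict_proba(3.0)  # log-odds = 0.5

# Raising x by one unit multiplies the odds by exp(coef).
odds_ratio = odds(p2) / odds(p1)
print(p1, odds_ratio, math.exp(coef))
```

The printed odds ratio matches `exp(coef)` up to floating-point rounding, which is the usual way a logistic regression coefficient is interpreted.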

Project: Build and validate linear regression and KNN models based on a provided data set.

Learn and implement core machine learning models to evaluate complex problems.

Working With API Data

  • Access public APIs and get information back.
  • Read and write data in JSON.
  • Use the requests library.

Natural Language Processing

  • Demonstrate how to tokenize natural language text.
  • Categorize and tag unstructured text data.
  • Build a text classification model using scikit-learn's CountVectorizer and TfidfVectorizer, and TextBlob.

Time Series Data

  • Create rolling means and plot time series data.
  • Examine autocorrelation on time series data.
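The two time series ideas above can be sketched in plain Python as follows. This is an illustration under assumptions: the `daily_sales` figures are invented, and the helper functions are hypothetical stand-ins for what pandas offers through `Series.rolling(...).mean()` and `Series.autocorr()`.

```python
def rolling_mean(values, window):
    """Return the mean of each full window of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

def autocorrelation(values, lag):
    """Sample autocorrelation of `values` with itself shifted by `lag` steps."""
    n = len(values)
    mean = sum(values) / n
    variance_sum = sum((v - mean) ** 2 for v in values)
    covariance_sum = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum

# Invented example data: one week of daily sales.
daily_sales = [10, 12, 14, 13, 15, 17, 16]

smoothed = rolling_mean(daily_sales, window=3)
print(smoothed)  # [12.0, 13.0, 14.0, 15.0, 16.0]

# A value near 1 at lag 1 suggests each day resembles the previous day.
print(autocorrelation(daily_sales, lag=1))
```

The rolling mean smooths out day-to-day noise, while the autocorrelation measures how strongly the series resembles a shifted copy of itself.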

Flex Sessions

Explore an additional data science topic based on class interest. Options include: clustering, decision trees, robust regression, and deploying models with Flask.

Data Science Immersive

Overview:

Live, Instructor-Led Classes
Onsite or Virtual Classes
Immersive (Full-Time): 480 Hours | 12 Weeks
HRD Corp Claimable

Prerequisites:

This is a fast-paced course with some prerequisites. Learners are recommended to have a strong mathematical foundation and familiarity with Python and programming fundamentals.

Ideal for:

Those wanting a career transformation. This full-time, award-winning data science course is designed to help learners launch a career in one of the most in-demand fields today.

Outcomes:

Graduates will have a professional-grade capstone project that showcases skills in predictive modelling, pattern recognition, and data visualisation, wrangling massive data sets to forecast trends and inform strategy.

Curriculum Outline:

  • Pre-Work Data Science Fundamentals
  • Fundamentals
  • Exploratory Data Analysis
  • Classical Statistical Modeling
  • Machine Learning Models
  • Advanced Topics and Trends
  • Post-Training – Career and interview coaching (option for individuals)

Course Outline

Dive into a series of self-paced lessons on the essentials of Python programming and applied math for data science before the course begins.

• Explore fundamental Python programming concepts, including variables, lists, loops, dictionaries, and data sets.
• Leverage programming tools like GitHub and the command line interface to manage data science projects.
• Practice solving coding challenges similar to the questions used in task-based data science interviews.
• Write and run Python functions using multiple arguments.
• Discover how key math concepts like statistical significance and probability distribution are applied throughout data science.

Get acquainted with essential data science tools and techniques, working in a programming environment to gather, organize, and share projects and data with Git and UNIX.

• Demonstrate familiarity with introductory programming concepts using Python and NumPy to navigate data sources and collections.
• Utilize UNIX commands to navigate file systems and modify files.
• Learn to track changes and iterations using Git version control from your terminal.
• Define and apply descriptive statistical fundamentals to sample data sets.
• Practice plotting and visualizing data using Python libraries like Matplotlib and Seaborn.

Project: Apply NumPy and Python programming skills to answer questions based on a clean data set.

Perform exploratory data analysis. Generate visual and statistical analyses, using Python and its associated libraries and tools to approach problems in fields like finance, marketing, and public policy.

• Design an experimental study with a well-thought-out problem statement and data framework.
• Use Pandas to read, clean, parse, and plot data, extracting and rearranging data through indexing, grouping, and JOINing.
• Review statistical testing concepts (p values, confidence intervals, lambda functions, correlation/causation) with SciPy and StatsModels.
• Learn to scrape website data using popular scraping tools.
• Explore bootstrapping, resampling, and building inferences about your data.

Project: Leverage Pandas and apply advanced NumPy and Python skills to clean, analyze, and test data from multiple messy data sets.

Explore effective study design and model evaluation and optimization, implementing linear and logistic regression, and classification models. Collect and connect external data to add nuance to your models using web scraping and APIs.

• Use scikit-learn and StatsModels to run linear and logistic regression models and learn to evaluate model fit.
• Begin to look at classification models by implementing the k-nearest neighbors (kNN) algorithm.
• Articulate the bias-variance trade-off as you practice evaluating classical statistical models.
• Use feature selection to deepen your knowledge of study design and model evaluation.
• Learn to apply optimization and regularization for fitting and tuning models.
• Dive into the math and theory behind how gradient descent helps to optimize loss functions for machine learning models.

Project: Explore, clean, and model data based on a provided data set, outlining your strategy and explaining your results.

Build machine learning models. Explore the differences between supervised and unsupervised learning via clustering, natural language processing, and neural networks.

• Define clustering and its advantages and disadvantages as compared to classification models.
• Build and evaluate ensemble models using decision trees, random forests, bagging, and boosting.
• Get acquainted with natural language processing (NLP) through sentiment analysis of scraped website data.
• Learn how Naive Bayes can simplify the process of analyzing data for supervised learning algorithms.
• Explore the history and use of Hadoop, as well as the advantages and disadvantages of using parallel or distributed systems to store, access, and analyze big data.
• Understand how Hive interacts with Hadoop and discover Spark’s advantages through big data case studies.
• Analyze and model time series data using the ARIMA model.

Project: Students will scrape and model their own data using multiple methods, outlining their approach and evaluating any risks or limitations.

Dive deeper into recommender systems, neural networks, and computer vision models, implementing what you’ve learned to productize models.

• Compare and contrast different types of neural networks and demonstrate how they are fit with backpropagation.
• Build and apply basic recommender systems to make predictions on sample user data.
• Work with career coaches to create and polish your professional portfolio.
• Practice with data science case studies to prepare for job interviews.

Project: Choose a data set to explore and model, providing a detailed notebook of your technical approach and a public presentation on your findings.


WHAT LEARNERS SAY

Solid Course

They provide really solid fundamental of data science. No more blackbox for traditional machine learning.

Data Science Immersive learner

Great For Beginners

The courses are great for beginners

Management Trainee at a leading financial services group

Engaging Class

I really appreciate the instructor's effort to keep us engaged in class. Additional knowledge and sharing from the instructor also allows us to know more about real scenarios.

Analyst of a leading financial services group

Akademi GA is an exclusive partner of General Assembly (GA) in Malaysia. Akademi GA is now a member of the Excelerate Group.

Akademi GA has acquired all rights to market and deliver General Assembly digital courses. It is registered as a training provider with the Ministry of Finance (MOF), Human Resource Development Corporation (HRD Corp) and Malaysia Digital Economy Corporation (MDEC).

Frequently Asked Questions

Yes! Upon passing this course, you will receive a signed certificate of completion. Thousands of GA alumni use their course certificate to demonstrate skills to employers and their LinkedIn networks. GA’s data science course is well-regarded by many top employers, who contribute to our curriculum and use our tech programmes to train their own teams.

Yes! All of our part-time courses are designed for busy professionals with full-time work commitments. 

You will be expected to spend time working on homework and projects outside of class hours each week, but the workload is designed to be manageable with a full-time job.

If you need to miss a session or two, we offer resources to help you catch up. We recommend you discuss any planned absences with your instructor.

For your capstone project, you’ll apply machine learning techniques to solve a real-world problem. You’ll develop a model, technical documentation, and stakeholder presentation, and graduate with a polished, portfolio-ready data science project to showcase your skills. We encourage you to tackle a problem that’s related to your work or a passion project you’ve been meaning to carve out time for.

Throughout the course, you’ll also complete a number of smaller projects designed to reinforce what you’ve learned in each unit.

This course is designed for data professionals who want to perform complex analysis to power predictions and add marketable skills to their resume. You’ll find a diverse range of students in the classroom: 

  • Data analysts, marketing analysts, BI analysts, or consultants who work with big data and need to upgrade their skills.
  • Software engineers who want to apply their programming skills toward a new career.
  • Other professionals with a quantitative background eyeing a transition to tech.

Ultimately, this programme attracts a community of eager learners who have an interest in manipulating large data sets and forecasting to impact strategy and bottom lines.