Quantitative Data Management, Analysis And Visualization With Python Workshop

INTRODUCTION

This comprehensive course guides you through using Python to analyze large datasets, produce clear visualizations, and apply machine learning techniques. It is suited both to novice programmers with some experience and to seasoned developers wishing to move into data science and big data analysis.

COURSE OBJECTIVES

At the end of the course, participants should be able to understand and apply:

  • Research Design
  • Python for Data Science and Machine Learning
  • Spark for Big Data Analysis
  • Implementation of Machine Learning Algorithms
  • NumPy for Numerical Data
  • Pandas for Data Analysis
  • Matplotlib for Python Plotting
  • Seaborn for statistical plots
  • Interactive dynamic visualizations
  • scikit-learn for Machine Learning Tasks
  • K-Means Clustering, Logistic Regression and Linear Regression
  • Random Forest and Decision Trees
  • Natural Language Processing and Spam Filters
  • Neural Networks
  • Support Vector Machines
  • Research report writing

DURATION

10 Days

WHO SHOULD ATTEND

The course targets professionals in Agriculture, Economics, Food Security and Livelihoods, Nutrition, Education, Medicine, and Public Health, among others, who already have elementary knowledge of statistics and wish to become conversant with the concepts and applications of statistical modeling using Python.

COURSE CONTENT

Module 1: Basic statistical terms and concepts

  • Introduction to statistical concepts
  • Descriptive Statistics
  • Inferential statistics

Module 2: Research Design

  • The role and purpose of research design
  • Types of research designs
  • The research process
  • Which method to choose?
  • Exercise: Identifying a project of choice and developing a research design

Module 3: Survey Planning, Implementation and Completion

  • Types of surveys
  • The survey process
  • Survey design
  • Methods of survey sampling
  • Determining the sample size
  • Planning a survey
  • Conducting the survey
  • After the survey
  • Exercise: Planning for a survey based on the research design selected
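
As a rough illustration of the sample size topic in this module, the sketch below applies Cochran's formula for estimating a proportion, with an optional finite population correction. The function name, the default 95% confidence level, and the example figures are assumptions chosen for illustration only, not part of the course materials.

```python
import math
from statistics import NormalDist


def cochran_sample_size(margin_of_error, confidence=0.95, proportion=0.5, population=None):
    """Estimate the sample size needed to measure a proportion (hypothetical helper)."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Cochran's formula for a very large (effectively infinite) population.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction, applied only when the population size is known.
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    # Round up, since a fraction of a respondent cannot be sampled.
    return math.ceil(n0)


# Illustrative figures: 5% margin of error at 95% confidence.
print(cochran_sample_size(0.05))                    # large population
print(cochran_sample_size(0.05, population=10000))  # population of 10,000
```

Using proportion=0.5 is the conservative choice, because it maximizes the variance term and therefore the required sample size.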

Module 4: Introduction to Python

  • Course Intro
  • Setup
  • Installation Setup and Overview
  • IDEs and Course Resources
  • iPython/Jupyter Notebook Overview

Module 5: Learning NumPy

  • Intro to NumPy
  • Creating arrays
  • Using arrays and scalars
  • Indexing Arrays
  • Array Transposition
  • Universal Array Functions
  • Array Processing
  • Array Input and Output

Module 6: Intro to Pandas

  • DataFrames
  • Index objects
  • Reindex
  • Drop Entry
  • Selecting Entries
  • Data Alignment
  • Rank and Sort
  • Summary Statistics
  • Missing Data
  • Index Hierarchy

Module 7: Working with Data

  • Reading and Writing Text Files
  • JSON with Python
  • HTML with Python
  • Microsoft Excel files with Python
  • Merge and Merge on Index
  • Concatenate and Combining DataFrames
  • Reshaping, Pivoting and Duplicates in Data Frames
  • Mapping, Replace, Rename Index, Binning, Outliers and Permutation
  • GroupBy on DataFrames
  • GroupBy on Dict and Series
  • Splitting, Applying and Combining
  • Cross Tabulation

Module 8: Big Data and Spark with Python

  • Welcome to the Big Data Section!
  • Big Data Overview
  • Spark Overview
  • Local Spark Set-Up
  • AWS Account Set-Up
  • Quick Note on AWS Security
  • EC2 Instance Set-Up
  • SSH with Mac or Linux
  • PySpark Setup
  • Lambda Expressions Review
  • Introduction to Spark and Python
  • RDD Transformations and Actions
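
To give a flavor of the lambda expressions review, the sketch below chains lambdas through Python's built-in map and filter and through functools.reduce. It is a plain-Python analogue of the transformation-then-action pattern and does not use PySpark itself; the variable names and the sample numbers are made up for illustration.

```python
from functools import reduce

# Made-up sample values, for illustration only.
readings = [3, 8, 15, 4, 23, 42]

# "Transformations": in Python 3, map and filter return lazy iterators,
# so nothing is computed yet.
squared = map(lambda x: x * x, readings)
large = filter(lambda x: x > 50, squared)

# "Action": reduce consumes the iterators and produces a single value.
total = reduce(lambda a, b: a + b, large)

print(total)  # 2582
```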

Module 9: Data Visualization

  • Installing Seaborn
  • Histograms
  • Kernel Density Estimate Plots
  • Combining Plot Styles
  • Box and Violin Plots
  • Regression Plots
  • Heatmaps and Clustered Matrices

Module 10: Data Analysis

  • Linear Regression
  • Support Vector Machines
  • Decision Trees and Random Forests
  • Natural Language Processing
  • Discrete Uniform Distribution
  • Continuous Uniform Distribution
  • Binomial Distribution
  • Poisson Distribution
  • Normal Distribution
  • Sampling Techniques
  • T-Distribution
  • Hypothesis Testing and Confidence Intervals
  • Chi-Square Test and Distribution
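
As a small illustration of the confidence interval topic, the sketch below computes a normal-approximation (Wald) interval for a survey proportion using only the Python standard library. The function name and the example counts are assumptions for illustration. The approximation is reasonable for large samples only; it is one of several possible interval methods.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Wald (normal-approximation) interval for a proportion (hypothetical helper)."""
    p_hat = successes / n
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the sample proportion.
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Illustrative figures: 120 of 400 respondents answered "yes".
low, high = proportion_confidence_interval(120, 400)
print(round(low, 3), round(high, 3))  # 0.255 0.345
```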

Module 11: Report writing for surveys, data dissemination, demand and use

  • Writing a report from survey data
  • Communication and dissemination strategy
  • Context of Decision Making
  • Improving data use in decision making
  • Culture Change and Change Management
  • Preparing a survey report, a communication and dissemination plan, and a demand and use strategy
  • Presentations and joint action planning

GENERAL NOTES

  • This course is taught by our instructors, who have years of professional experience in their respective fields. It is delivered through a combination of practical exercises, theory, group projects, and case studies.
  • The participants receive training manuals and supplementary reading materials.
  • Participants who complete this course successfully will receive a certificate.
  • We can also create a course specifically for your organization to match your needs. To learn more, get in touch with us at training@dealsontrainers.org.
  • The training will take place at DEALSON TRAINERS in Nairobi, Kenya.
  • The training fee includes lunch, course materials, and lodging for the training session. Upon request, we may arrange for our participants' lodging and transportation to the airport.
  • Payment must be made to our bank account before the training begins, and proof of payment should be emailed to training@dealsontrainers.org.
