CSC392/CSC310 Homepage

Programming for Data Science

Spring 2018

Instructor:

Dr. Lutz Hamel
Tyler, Rm 251
Office Hours: Tuesday 2-3pm and Thursday 11am-noon
email: lutzhamel@uri.edu

Announcements:

For other course announcements see Sakai

[3/6/18] The midterm proposal format is here, and if you are doing a data analysis project, check these guidelines for reports.

[2/6/18] It turns out that the clear command works differently on different computers. Here is a way to compute the correct command so that your code stays portable:
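import platform

def clear_cmd():
    "return the shell command that clears the screen on this platform"
    # win32_ver() returns empty strings on non-Windows systems
    if platform.win32_ver()[0]:
        return 'cls'
    else:
        return 'clear'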
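
To use the command, pass it to os.system. The following is only a minimal sketch: the wrapper name clear_screen is made up for this illustration and is not part of the course code.

```python
import os
import platform

def clear_cmd():
    "return the shell command that clears the screen on this platform"
    if platform.win32_ver()[0]:
        return 'cls'
    else:
        return 'clear'

def clear_screen():
    "hypothetical helper: run the platform's clear command in the shell"
    os.system(clear_cmd())
```

Calling clear_screen() in place of a hard-coded os.system('clear') should keep the same script working on Windows, macOS, and Linux.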

[2/6/18] Display array function:
import os
import time

def display_array(ar):
    "clear the screen, display the contents of an array, wait for 1sec"
    os.system('clear')  # not portable: on Windows use the command from clear_cmd()

    rows = len(ar)    # grab the rows

    if rows == 0:
        raise ValueError("Array contains no data")

    cols = len(ar[0]) # grab the columns - indices start at 0!

    for i in range(rows):
        for j in range(cols):
            print(ar[i][j], end=' ') # no newline, space separated
        print()

    time.sleep(1)
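
One way to use this idea is a simple text animation that shows a sequence of arrays one after another. The sketch below is a variant and not the course's reference code: the names render_array and animate and the delay parameter are assumptions made for this example, and rows are joined with single spaces, so there is no trailing space.

```python
import os
import platform
import time

def render_array(ar):
    "return the rows of a 2-D list as space-separated lines of text"
    if len(ar) == 0:
        raise ValueError("Array contains no data")
    return "\n".join(" ".join(str(x) for x in row) for row in ar)

def animate(frames, delay=1.0):
    "clear the screen, print each frame, and wait delay seconds between frames"
    cmd = 'cls' if platform.win32_ver()[0] else 'clear'
    for frame in frames:
        os.system(cmd)
        print(render_array(frame))
        time.sleep(delay)
```

For example, animate([[[0, 1], [1, 0]], [[1, 0], [0, 1]]], delay=0.5) would alternate between two 2x2 patterns. Building the text in render_array and printing it separately also makes the formatting easy to check without touching the screen.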
[1/22/18] Welcome!

Description:

Data science exists at the intersection of computer science, statistics, and machine learning. Writing programs that access and manipulate data, so that it becomes available for analysis with statistical and machine learning techniques, is therefore at the core of data science. Data scientists use their data and analytical ability to find and interpret rich data sources; manage large amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure consistency of datasets; create visualizations to aid in understanding data; build mathematical models using the data; and present and communicate the data insights and findings.

This course provides a survey of data science. Topics include data-driven programming in Python; data sets, file formats, and metadata; descriptive statistics, data visualization, and foundations of predictive data modeling and machine learning; accessing web data and databases; and distributed data management. You will work on substantial weekly programming problems, such as accessing data in a database and visualizing it, or building machine learning models of a given data set.

Upon completion of this course

Documents of Interest:

Course

Python

Jupyter Notebooks

Note: Windows users do not need to install Linux or any other OS extensions. Anaconda installs natively on Windows and inserts a Jupyter Notebook Launcher into the start menu.
Note: Jupyter Notebooks only support Chrome, Safari, and Firefox. They do NOT support Internet Explorer or Opera.

Data Sets

Assignments:

Note: Unless otherwise noted, homework must be submitted via Sakai.