CS3352-Foundations-of-Data-Science-Nov-Dec-2022-Question-Paper-Download (1)

This document is a question paper for the B.E/B.Tech. degree examinations in Computer Science and Engineering, specifically for the course CS 3352 - Foundations of Data Science. It includes various questions divided into three parts, covering topics such as data science definitions, data analysis techniques, variable types, and exploratory data analysis. The paper is structured to assess students' understanding of data science concepts and their practical applications using Python.

EnggTree.com

Reg. No.:

Question Paper Code : 70072

B.E/B.Tech. DEGREE EXAMINATIONS, NOVEMBER/DECEMBER 2022.

Third Semester

Computer Science and Engineering

CS 3352 - FOUNDATIONS OF DATA SCIENCE

(Common to: Computer and Communication Engineering / Information Technology)

(Regulations 2021)

Time : Three hours Maximum : 100 marks

Answer ALL questions.

PART A - (10 x 2 = 20 marks)

1. Define Data Science and Big Data.

2. Give an overview of common errors in retrieving data and the cleansing
solutions to be employed.

3. Classify the following data into their types and write a brief note on
each: (a) ethnic group (b) age (c) family size (d) academic major
(e) sexual preference (f) IQ score (g) net worth (dollars)
(h) third-place finish (i) gender (j) temperature.

4. Differentiate discrete and continuous variables.

5. What is a percentile rank? Give an example.

6. Suppose Helen sent 10 greeting cards to her friends and received 8 cards
back. What kind of relationship is this? Explain briefly.

7. List the attributes of a Numpy array. Give an example for it.
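
As a minimal sketch of the kind of example this question asks for, the snippet below builds a small array and reads its commonly used attributes. The array contents and the name `arr` are illustrative assumptions, not part of the question paper.

```python
import numpy as np

# Illustrative 3 x 4 array of the integers 0..11 (values chosen for the example)
arr = np.arange(12).reshape(3, 4)

print("ndim:    ", arr.ndim)      # number of dimensions
print("shape:   ", arr.shape)     # size along each dimension
print("size:    ", arr.size)      # total number of elements
print("dtype:   ", arr.dtype)     # element data type (platform dependent)
print("itemsize:", arr.itemsize)  # bytes per element
print("nbytes:  ", arr.nbytes)    # total bytes used by the elements
```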

8. Create a data frame with the following key-data pairs: A-10, B-20,
A-40, C-5, B-10, C-10. Find the sum for each key and display the result
grouped by key.
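
One possible approach, sketched here under the assumption that the pairs are stored in two columns named `key` and `data` (names chosen for illustration), is to group by the key column and sum the data column with Pandas.

```python
import pandas as pd

# Key-data pairs exactly as listed in the question
df = pd.DataFrame({
    "key": ["A", "B", "A", "C", "B", "C"],
    "data": [10, 20, 40, 5, 10, 10],
})

# Group the rows by key and sum the data values within each group
sums = df.groupby("key")["data"].sum()
print(sums)
```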

9. What is the purpose of errorbar function in Matplotlib? Give an example.

10. Showcase 3-dimensional drawing in Matplotlib with corresponding Python
code.

Downloaded from EnggTree.com


PART B - (5 x 13 = 65 marks)

11. (a) Examine the different facets of data with the challenges in their
processing.
Or
(b) Explore the various steps associated with the data science process and
explain any three of them with suitable diagrams and examples.

12. (a) Demonstrate the different types of variables used in data analysis with
an example for each.
Or
(b) The number of friends reported by Facebook users is summarized in the
following frequency distribution.
FRIENDS          f
400 - above      2
350 - 399        5
300 - 349       12
250 - 299       17
200 - 249       23
150 - 199       49
100 - 149       27
50 - 99         29
0 - 49          30
Total          200
(i) What is the shape of this distribution?
(ii) Find the relative frequencies.
(iii) Find the approximate percentile rank of the interval 300-349.
(iv) Convert to a histogram.
(v) Why would it not be possible to convert to a stem and leaf display?
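
Parts (ii) and (iii) can be sketched in plain Python as follows. The frequencies as printed add up to 194 rather than the stated total of 200, so this sketch divides by the computed sum; that choice, and the variable names, are assumptions made for illustration. The percentile rank is approximated by the cumulative percentage of observations at or below the interval.

```python
# Class intervals (highest first) and frequencies as printed in the table
intervals = ["400 - above", "350 - 399", "300 - 349", "250 - 299",
             "200 - 249", "150 - 199", "100 - 149", "50 - 99", "0 - 49"]
freqs = [2, 5, 12, 17, 23, 49, 27, 29, 30]

# Assumption: normalize by the sum of the printed frequencies
total = sum(freqs)
rel_freqs = [f / total for f in freqs]

# Cumulative frequency: observations at or below each interval,
# accumulated from the lowest interval upward
cum = {}
running = 0
for label, f in zip(reversed(intervals), reversed(freqs)):
    running += f
    cum[label] = running

# Approximate percentile rank = cumulative percentage for the interval
pr_300_349 = 100 * cum["300 - 349"] / total

for label, f, rf in zip(intervals, freqs, rel_freqs):
    print(f"{label:<12} f={f:<3} relative f={rf:.2f}")
print(f"Approximate percentile rank of 300 - 349: {pr_300_349:.1f}")
```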

13. (a) (i) Categorize the different types of relationships using scatter
plots. (7)
(ii) Each of the following pairs represents the number of licensed
drivers (X) and the number of cars (Y) for seven houses in my
neighborhood:
Drivers (X)   Cars (Y)
[the seven data pairs are not legible in the source]


(1) Construct a scatterplot to verify a lack of pronounced
curvilinearity. (2)
(2) Determine the least squares equation for these data.
(Remember, you will first have to calculate r, SSy and SSx) (2)
(3) Determine the standard error of estimate, Sy/x, given that
n = 7. (2)

Or
(b) (i) In studies dating back over 100 years, it's well established that
regression toward the mean occurs between the heights of fathers
and the heights of their adult sons.
Indicate whether the following statements are true or false.
(1) Sons of tall fathers will tend to be shorter than their fathers. (1)
(2) Sons of short fathers will tend to be taller than the mean for
all sons. (1)
(3) Every son of a tall father will be shorter than his father. (1)
(4) Taken as a group, adult sons are shorter than their fathers. (1)
(5) Fathers of tall sons will tend to be taller than their sons. (1)
(6) Fathers of short sons will tend to be taller than their sons but
shorter than the mean for all fathers. (1)
(ii) Interpret the value of r² in correlation-based analysis. (7)

14. (a) Imagine you have a series of data that represents the amount of
precipitation each day for a year in a given city. Load the daily rainfall
statistics for the city of Chennai in 2021, which are given in the CSV file
Chennairainfall2021.csv, using Pandas. Generate a histogram for rainy
days and find the days that have high rainfall.
Or
(b) Consider an e-commerce organization like Amazon that has sales data for
different regions in the files NorthSales.csv, SouthSales.csv, WestSales.csv
and EastSales.csv. They want to combine North and West region sales, and
South and East region sales, to find the aggregate sales of these
collaborating regions. Help them to do so using Python code.

15. (a) How are text and image annotations done using Python? Give an
example of your own with appropriate Python code.

Or
(b) Appraise the following with appropriate Python code: (i) Histograms
(ii) Binnings (iii) Density.


PART C - (1 x 15 = 15 marks)

16. (a) Perform an exploratory data analysis for the following data with different
types of plots:
The dataset contains cases from a study that was conducted between
1958 and 1970 at the University of Chicago’s Billings Hospital on the
survival of patients who had undergone surgery for breast cancer.
Data attributes:-
Age of patient at the time of operation (numerical)
Patient’s year of operation (year — 1900, numerical)
Number of positive axillary nodes detected (numerical)
Survival status (class attribute): 1 = the patient survived 5 years or
longer, 2 = the patient died within 5 years

Or
(b) Assume that an r of -.80 describes the strong negative relationship
between years of heavy smoking (X) and life expectancy (Y).
Assume, furthermore, that the distributions of heavy smoking and life
expectancy each have the following means and sums of squares:
mean of X = 5, mean of Y = 60, SSx = 35, SSy = 70
(i) Determine the least squares regression equation for predicting life
expectancy from years of heavy smoking. (3)
(ii) Determine the standard error of estimate, Sy/x, assuming that the
correlation of -.80 was based on n = 50 pairs of observations. (3)
(iii) Supply a rough interpretation of Sy/x. (3)
(iv) Predict the life expectancy for John, who has smoked heavily for
8 years. (3)
(v) Predict the life expectancy for Katie, who has never smoked
heavily. (3)
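
The calculations in parts (i), (ii), (iv) and (v) can be sketched as below. The sketch takes the means as 5 (X) and 60 (Y) and the sums of squares as 35 (SSx) and 70 (SSy), which is an assumed reading of the values given in the question. It also assumes the textbook formulas b = r * sqrt(SSy / SSx), a = mean of Y - b * mean of X, and Sy/x = sqrt(SSy * (1 - r^2) / (n - 2)); the variable and function names are illustrative.

```python
import math

# Values taken from the question (assumed reading of the summary statistics)
r = -0.80
mean_x, mean_y = 5, 60
ss_x, ss_y = 35, 70
n = 50

# (i) Least squares regression equation: Y' = b * X + a
b = r * math.sqrt(ss_y / ss_x)
a = mean_y - b * mean_x

# (ii) Standard error of estimate, using n - 2 degrees of freedom
s_yx = math.sqrt(ss_y * (1 - r ** 2) / (n - 2))


def predict(years_of_heavy_smoking):
    """Predicted life expectancy from the fitted regression line."""
    return b * years_of_heavy_smoking + a


# (iv) John: 8 years of heavy smoking; (v) Katie: 0 years
john = predict(8)
katie = predict(0)

print(f"Y' = {b:.2f} X + {a:.2f}")
print(f"Sy/x = {s_yx:.2f}")
print(f"John: {john:.2f}, Katie: {katie:.2f}")
```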

