DEPARTMENT OF COMPUTING AND MATHEMATICS

(TSING YI)

HIGHER DIPLOMA IN APPLIED STATISTICS AND COMPUTING

 

 

 

 

Unit Name: Statistical Computing

Unit Number: CMFD428

Overall Assessment: Continuous Assessment 100%

Assignment Number: 3

This Assignment: 40% (of 100%)

Hand-out Date: 26 March, 1999 (Friday)

Hand-in Date: 27 April, 1999 (Tuesday), during the lecture

 

 

Guidance:

 

PART A (data management) – 15 marks

  1. List FIVE different data combining methods.
  2. Suggest ONE real-life example for each combining method and give the corresponding SAS program (you may create your own datasets for the combinations, if necessary).

 

PART B (marketing research) – 30 marks

The dataset q:\user\practice\hdasc2\sc\a3_partb.sd2 on the server contains 100 observations on the variables OCCUPATIONS, SEX and INCOME.

You are required to draw a sample of size 20 from this dataset in each of the following ways.

i) Draw a simple random sample of size

(a) approximately 20;

(b) exactly 20.

ii) Draw a systematic sample

(a) without a random start, so that the first observation is drawn;

(b) with a random start.

iii) Draw a stratified random sample with OCCUPATIONS as the strata.

Hints: Count the number of observations in each occupation.

Select the proportionate sample size (approximately) in each occupation (i.e. stratum).

iv) Draw a quota sample with the control element being Male, i.e., taking the first 10 male respondents.
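The difference between the schemes in i) and ii) can be illustrated outside SAS. The following is a minimal Python sketch, not a SAS solution: it samples record numbers 1 to 100 as a stand-in for the dataset, and the function names and the seed are hypothetical choices made for illustration only.

```python
import random

def srs_approximate(n_pop, n_target, rng):
    # Keep each record independently with probability n_target / n_pop,
    # so the sample size is random and only about n_target on average.
    p = n_target / n_pop
    return [i for i in range(1, n_pop + 1) if rng.random() < p]

def srs_exact(n_pop, n_target, rng):
    # One pass over the records: select record i with probability
    # (number still needed) / (records left), giving exactly n_target.
    chosen, needed = [], n_target
    for i in range(1, n_pop + 1):
        remaining = n_pop - i + 1
        if rng.random() < needed / remaining:
            chosen.append(i)
            needed -= 1
    return chosen

def systematic(n_pop, n_target, start):
    # Take every k-th record beginning at `start`, with k = n_pop // n_target.
    step = n_pop // n_target
    return list(range(start, n_pop + 1, step))[:n_target]

rng = random.Random(428)  # arbitrary seed, used only for repeatability
step = 100 // 20
print(len(srs_approximate(100, 20, rng)))               # size varies around 20
print(len(srs_exact(100, 20, rng)))                     # size is always 20
print(systematic(100, 20, start=1))                     # no random start
print(systematic(100, 20, start=rng.randint(1, step)))  # random start in 1..k
```

The stratified and quota samples in iii) and iv) depend on the values of OCCUPATIONS and SEX in the dataset, so they are not sketched here.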

 

PART C (simulation and time series) – 25 marks

  1. Randomly generate 365 observations (days) which follow a normal distribution with mean equal to 10 and standard deviation equal to 5.
  2. Based on the data in 1, use a SAS macro to calculate an n-day simple moving average, where n and the starting date are input parameters.
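The logic of the two steps above can be sketched in Python (an illustration only; the assignment itself requires a SAS macro). The sketch assumes that days are numbered 1 to 365 and that the "starting date" is the first day included in the first averaging window; both are assumptions, as are the function names and the seed.

```python
import random

def simulate_days(n_days=365, mean=10.0, sd=5.0, seed=1999):
    # One normal draw per day; the seed is an arbitrary, hypothetical choice.
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n_days)]

def moving_average(values, n, start_day=1):
    # Days are numbered from 1. The first average covers days
    # start_day .. start_day + n - 1 and is reported against its last day.
    result = []
    for last_day in range(start_day + n - 1, len(values) + 1):
        window = values[last_day - n:last_day]
        result.append((last_day, sum(window) / n))
    return result

data = simulate_days()
for day, avg in moving_average(data, n=7, start_day=30)[:3]:
    print(day, round(avg, 2))
```

Passing n and start_day as function arguments mirrors the role of the macro parameters asked for in the task.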

 

PART D (ANOVA) – 30 marks

Two groups of children, one group with Attention Deficit Disorder (ADD) and the other (control) group without ADD, were randomly given either the drug PLACEBO or the drug RITALIN. A measure of activity was made on all the children, with the results shown in the table below (higher numbers indicate more activity).

 

GROUP      DRUG       ACTIVITY
ADD        PLACEBO    90
ADD        PLACEBO    88
ADD        PLACEBO    95
CONTROL    PLACEBO    60
CONTROL    PLACEBO    62
CONTROL    PLACEBO    66
ADD        RITALIN    72
ADD        RITALIN    70
ADD        RITALIN    64
CONTROL    RITALIN    86
CONTROL    RITALIN    86
CONTROL    RITALIN    82

  1. Perform a two-way ANOVA (GROUP by DRUG) on these data. Use a Duncan multiple-range test to determine any significant main effects. Draw an interaction plot like the one below (use OPTIONS PS=25 LS=80;).
  2. Refer to the hardcopy for the plot layout, as the diagram below could not be reproduced exactly in HTML format.

    Plot of ACTIVITY*DRUG. Symbol is value of GROUP.

    ACTIVITY |
         100 +
             |  A
             |  A
             |  A                                              C
             |                                                 C
          80 +
             |                                                 A
             |                                                 A
             |  C
             |  C                                              A
          60 +  C
             +--+----------------------------------------------+--
             PLACEBO                                        RITALIN
                                      DRUG

    NOTE: 1 obs hidden.

  3. Write the SAS statements to conduct a t-test comparing the two drugs, separately for the ADD children and for the control children.
  4. An alternative approach to task 1 is to run a one-way ANOVA with each combination of GROUP and DRUG as a level of the independent variable. That is, the four levels would be ADD-PLACEBO, ADD-RITALIN, CONTROL-PLACEBO, and CONTROL-RITALIN. Write a DATA step that creates a new variable (call it LEVEL) that defines each of these four groups. Then perform a one-way ANOVA using this new variable as the independent (CLASS) variable. Include the Duncan multiple-range test in your analysis as well.
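The idea behind the LEVEL variable in the last task can be illustrated with a short Python sketch (not the SAS DATA step and procedure the task asks for). It builds LEVEL by joining GROUP and DRUG from the table above, then computes the one-way ANOVA sums of squares and F statistic by hand; the function name is hypothetical and the Duncan test is not covered.

```python
from collections import defaultdict

# (GROUP, DRUG, ACTIVITY) rows copied from the table in Part D.
rows = [
    ("ADD", "PLACEBO", 90), ("ADD", "PLACEBO", 88), ("ADD", "PLACEBO", 95),
    ("CONTROL", "PLACEBO", 60), ("CONTROL", "PLACEBO", 62), ("CONTROL", "PLACEBO", 66),
    ("ADD", "RITALIN", 72), ("ADD", "RITALIN", 70), ("ADD", "RITALIN", 64),
    ("CONTROL", "RITALIN", 86), ("CONTROL", "RITALIN", 86), ("CONTROL", "RITALIN", 82),
]

# LEVEL is the GROUP-DRUG combination, e.g. "ADD-PLACEBO".
levels = defaultdict(list)
for group, drug, activity in rows:
    levels[f"{group}-{drug}"].append(activity)

def one_way_anova(groups):
    # Returns (SS between, SS within, df between, df within, F).
    all_values = [v for g in groups.values() for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = {name: sum(g) / len(g) for name, g in groups.items()}
    ss_between = sum(len(g) * (means[name] - grand_mean) ** 2
                     for name, g in groups.items())
    ss_within = sum((v - means[name]) ** 2
                    for name, g in groups.items() for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

print(sorted(levels))
print(one_way_anova(levels))
```

With four levels of three observations each, the one-way analysis has 3 degrees of freedom between levels and 8 within.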