**Introduction to Stata
**This file serves as a companion to the Stata Guide provided for the lecture.
**Both can be accessed on EdShare along with the mini-data sets used in the lecture.

**The Stata Interface
*Setting a Working Directory
*Drop-down menu or code
cd "C:\Users\Nicco\Documents\Talks\Introduction_to_Stata"

**Do-files
*How to Open
*What to use for: Data Cleaning / Model code
*What are annotations
*Clean files

**Log Files
*Why?
*Drop-down menu
*Stata code to open it
log using "C:\Users\Nicco\Documents\Talks\Introduction_to_Stata\log_file.smcl"

**Open a data set in Stata
*Drop-down menu: open or import
*Open directly from a file folder
*file path, if known
use "C:\Users\Nicco\Documents\Talks\Introduction_to_Stata\data_1.dta"

**Viewing the data set
browse
edit

*Variable details
*you can observe the variable details in the open window
*you can edit the details in the window, or in code, as discussed later

*Merging Data sets
*merge: based on unique identifiers
merge 1:1 ID using "C:\Users\Nicco\Documents\Talks\Introduction_to_Stata\data_2.dta"
*look at new data set
*append: based on same variable codings across data sets
append using "C:\Users\Nicco\Documents\Talks\Introduction_to_Stata\data_3.dta"
*look at new data set

*Create new variables
gen age_Months=age*12
*look at new variable
sum age_Months

*Adapt variable descriptors
label variable age_Months "Age in months"

*Adapt or define/assign data value labels
label define education 1 "low" 2 "medium" 3 "high"
*notice this has not changed anything in the data set yet
label values education_level_simple education

*Command structure for univariate analysis
*typically a command is issued first, then the variable you wish to apply it to
di 3+7
*most basic command to use Stata like a calculator
*order of operations is important
describe
sum age polattention
sum age, detail
*detail is an option that is not needed for the basic code
*the option tells Stata you want more detailed output
tab partyid
*sometimes data requires different formats
*for assessment based on the type of data you have

*Data visualisations
*basic command set-up is the type of graph, the y-axis variable, the x-axis variable
*bar chart
hist leftright, frequency discrete width(.5)
*why all the options?
hist leftright, percent discrete width(.5)

*Installing packages
*basic Stata does not include all the optional packages you could use
*install them as you learn you need them, one time only
*ssc install catplot
catplot leftright, recast(bar)

*histogram
hist age, frequency
hist age, freq
*short-hand can be used when the string of letters is unique to the command or option
hist age, bin(5)
hist age, width(3)

*Graph commands
*there is a standard set of graph commands that can be used,
*for details see the help file for "graph"
graph box age
graph box age, over(education_level_simple)

*Editing Graphs in the editor
hist age, freq

*Basic Commands to edit in the code
hist age, freq ytitle(Number of Respondents) xtitle(Age of Respondents) title(Distribution of Respondent Ages)

*Set scheme
*why do this?
set scheme s1mono
hist age, freq ytitle(Number of Respondents) xtitle(Age of Respondents) title(Distribution of Respondent Ages)

*Test commands
ttest age==32
ttest age, by(location) unequal
*notice we now include two options here

*Changing the confidence level (optional)
*default set at 95%
ttest age==50
ttest age==50, level(90)
ttest age==50, level(99)

*Bivariate commands
tab partyid leftright
*Options work for these as well
tab partyid leftright, chi2
*graphical displays for bivariate relationships
graph twoway (scatter polattention age)
*graph command allows you to layer graphs
graph twoway (scatter polattention age) (lfit polattention age)
graph twoway (scatter polattention age) (lfit polattention age) (lowess polattention age)

*Correlation
*some statistical techniques have multiple ways to be run in Stata
corr age polattention gross_personal_income
pwcorr age polattention gross_personal_income, sig

**Modelling Relationships
reg polattention education_level_simple leftright partyid age
mlogit partyid education_level_simple leftright age gross_personal_income
*help files
help mlogit
*Post-estimation commands
*model fit
estat ic
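
*Further post-estimation steps: a minimal sketch added for illustration, not part of the lecture code
*the names polattention_hat, polattention_res, m_full and m_age are hypothetical
*re-run the linear model so the commands below apply to it rather than to mlogit
reg polattention education_level_simple leftright partyid age
*store fitted values and residuals as new variables
predict polattention_hat, xb
predict polattention_res, residuals
sum polattention_hat polattention_res
*store this model, fit a simpler one, and list AIC/BIC for both side by side
estimates store m_full
reg polattention age
estimates store m_age
estimates stats m_full m_age
*the comparison is only meaningful if both models are fitted on the same observations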