- The PDV (Program Data Vector) is a logical area in memory where SAS builds a data set one observation at a time; its job is to hold the current observation. SAS processes a DATA step in two phases: compilation and execution. During compilation, SAS first creates the input buffer to hold a record from an external file and then creates the PDV. The PDV includes every variable referenced explicitly or implicitly in the DATA step, plus two automatic variables, _N_ and _ERROR_. During execution, it holds the working values of those variables as the DATA step program processes them.
2. Difference between IF and WHERE:
- IF: the subsetting IF can only be used in a DATA step. Many IF statements can be used in one DATA step. The record must be read into the program data vector before IF can test it.
- WHERE: can be used in a DATA step as well as in a PROC. A second WHERE statement replaces the first unless WHERE ALSO is used. WHERE subsets the data before the record is read into the PDV.
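The difference can be sketched with the SASHELP.CLASS sample data set that ships with SAS. The output data set names (teens_if, teens_where) are made up for illustration.

```sas
/* Subsetting IF: each row is read into the PDV first, then tested */
data teens_if;
    set sashelp.class;
    if age >= 13;
run;

/* WHERE: rows are filtered before they reach the PDV */
data teens_where;
    set sashelp.class;
    where age >= 13;
run;

/* WHERE also works in a PROC; WHERE ALSO adds a condition instead of replacing the first */
proc print data=sashelp.class;
    where age >= 13;
    where also sex = 'F';
run;
```

Both DATA steps produce the same rows here; the difference is where the filtering happens.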
- In addition to the %LET statement, other features of the macro language that create macro variables are:
- iterative %DO statement
- %GLOBAL statement
- %INPUT statement
- INTO clause of the SELECT statement in SQL
- %LOCAL statement
- %MACRO statement
- SYMPUT routine and SYMPUTN routine in SCL
- %WINDOW statement
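A minimal sketch of three of these features, assuming the SASHELP.CLASS sample data set is available. The names n_students and loop are hypothetical.

```sas
/* INTO clause of SELECT in PROC SQL creates a macro variable */
proc sql noprint;
    select count(*) into :n_students
    from sashelp.class;
quit;

/* Reassigning with %LET strips the leading blanks left by INTO */
%let n_students = &n_students;
%put Number of students: &n_students;

/* %LOCAL declares a macro variable; the iterative %DO assigns it a value on each pass */
%macro loop;
    %local i;
    %do i = 1 %to 3;
        %put Iteration &i;
    %end;
%mend loop;

%loop
```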
4. How to convert rows to columns / columns to rows in SAS?
- PROC TRANSPOSE is used to convert the data between the two layouts.
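As a rough illustration, the following sketch turns one row per id and quarter into one row per id with a column per quarter. The data set and variable names (long, wide, quarter, sales) are invented for the example.

```sas
data long;
    input id $ quarter $ sales;
    datalines;
A Q1 10
A Q2 20
B Q1 30
B Q2 40
;
run;

/* BY-group processing needs sorted input */
proc sort data=long;
    by id;
run;

/* One output row per id; values of QUARTER become the new column names */
proc transpose data=long out=wide(drop=_name_);
    by id;
    id quarter;
    var sales;
run;
```

The resulting data set has the variables id, Q1 and Q2.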
5. What is the difference between using drop = data set option in data statement and set statement?
- If you do not want to process certain variables and do not want them to appear in the new dataset, put the drop= option on the SET statement.
- Example: set datasetname(drop=var1 var2);
- If you want to process certain variables but do not want them to appear in the output dataset, put the drop= option on the DATA statement.
- Example: data newdatasetname(drop=var1 var2);
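A short sketch of the practical consequence, using SASHELP.CLASS. The data set names and the ratio variable are hypothetical.

```sas
/* drop= on the DATA statement: height and weight are usable in the step,
   but are not written to the output data set */
data ratio_ok(drop=height weight);
    set sashelp.class;
    ratio = weight / height;
run;

/* drop= on the SET statement: height and weight are never read, so the
   references below create new, uninitialized variables and ratio is missing */
data ratio_missing;
    set sashelp.class(drop=height weight);
    ratio = weight / height;
run;
```

The second step runs, but the log reports uninitialized variables, which is a common sign that drop= was placed on the wrong statement.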
6. What is the difference between "+" operator and SUM function?
- "+" operator returns the output as missing if there are any missing values in the data
- Example: Y= 3 + . + 2 output: Y=.
- SUM function returns the sum of non missing values even if there are missing values in the data
- Example: Y= sum(3 , . ,2) output: Y=5
7. How many datatypes are there in SAS?
- We have 2 datatypes in SAS:
- Numeric
- Character
- Date is considered as numeric datatype.
8. How to remove duplicate observations in SAS?
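- Here are a few techniques to remove duplicates:
- Using the NODUP or NODUPKEY option in PROC SORT
- Example: proc sort data=datasetname nodup; by var1; run;
- Using the first. and last. variables in a DATA step (the input must be sorted by the BY variable)
- Example:
data datasetname;
set inputdata;
by id;
if first.id; /* keeps one observation per id; "if first.id and last.id" would keep only ids that occur exactly once */
run;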
9. What is the difference between input and put function in SAS?
- PUT: applies a format to a value and returns a character string; typically used to convert numeric to character
- Example:
data put_function;
pincode_num= 123456;
pincode_char= put(pincode_num, 6.);
run;
- INPUT: reads a character string with an informat; typically used to convert character to numeric
- Example:
data input_function;
salary_char= '12345678';
salary_num= input(salary_char, 8.);
run;
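10. What is the default length of the result when using the SCAN function?
- 200 characters in older SAS releases, when the result is assigned to a variable whose length has not been set. More recent documentation (SAS 9.4) states that the new variable is given the length of the first argument instead, so it is safer to declare the length explicitly.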
11. Name a few SAS functions that you have worked with.
- substr
- find
- intnx
- intck
- index
- catx
- sum
- anyalnum
- scan
12. How do SAS dates work?
- A SAS date is stored as a numeric value: the number of days since Jan 1, 1960, which has the SAS date value 0. Later dates are positive and earlier dates are negative. For example, Jan 1, 1961 is 366, because 1960 was a leap year.
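The numeric nature of SAS dates can be checked with a small DATA _NULL_ step. The variable names d0 and d1 are arbitrary.

```sas
data _null_;
    d0 = '01JAN1960'd;        /* date literal */
    d1 = mdy(1, 1, 1961);     /* same kind of value built from month, day, year */
    put d0= d1=;              /* writes d0=0 d1=366 to the log */
    put d1= date9.;           /* writes d1=01JAN1961 to the log */
run;
```

A format such as DATE9. only changes how the number is displayed, not the stored value.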
13. What are the default statistics of PROC MEANS?
- N, mean, standard deviation, minimum, maximum
14. What are a few procedures that you have worked with?
- PROC SORT: sorts the data by the variables named in the BY statement. With NODUPKEY, it also removes observations with duplicate BY values.
- PROC APPEND: adds the observations of one dataset to the end of another. The FORCE option is needed when the two datasets do not match, for example when a variable has a different length or exists only in the dataset being appended.
15. What is the difference between NODUPKEY and NODUP in the SORT procedure?
- NODUP compares all variables and removes an observation when it is identical to the previous one written. NODUPKEY compares only the BY variables and keeps the first observation for each combination of BY values.
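A small sketch that contrasts the two options on the same input. The data set and variable names (visits, no_dup_recs, no_dup_keys) are invented for the example.

```sas
data visits;
    input id visit $;
    datalines;
1 A
1 A
1 B
2 C
;
run;

/* NODUP: drops the second "1 A" row because the whole record repeats */
proc sort data=visits out=no_dup_recs nodup;
    by id;
run;

/* NODUPKEY: keeps only the first row for each id */
proc sort data=visits out=no_dup_keys nodupkey;
    by id;
run;
```

Here no_dup_recs ends up with three observations and no_dup_keys with two.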
16. What is the difference between VAR V1 - V3 and VAR V1 -- V3?
- VAR V1 - V3 is a numbered range list: it returns the V1, V2 and V3 variables.
- VAR V1 -- V3 is a name range list: it returns all the variables positioned from V1 through V3 in the dataset.
- For example, if the dataset has 5 variables in the order V1 name id V2 V3, then V1 -- V3 returns all five, while V1 - V3 returns only V1, V2 and V3.
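The following sketch builds a one-row data set with that variable order so the two lists can be compared. The data set name demo and the values are arbitrary.

```sas
/* The LENGTH statement fixes the variable order: V1 name id V2 V3 */
data demo;
    length V1 8 name $8 id 8 V2 8 V3 8;
    V1 = 1; name = 'abc'; id = 7; V2 = 2; V3 = 3;
run;

/* Numbered range list: prints V1, V2, V3 */
proc print data=demo;
    var V1-V3;
run;

/* Name range list: prints V1, name, id, V2, V3 */
proc print data=demo;
    var V1--V3;
run;
```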
17. What is the difference between format and informat?
- An informat is a specification for how raw data should be read.
- A format is a layout specification for how a variable should be printed or displayed.
18. Explain why the double trailing @@ is used in an INPUT statement.
- A double trailing @@ tells SAS to hold the current record across iterations of the DATA step, so the next execution of the INPUT statement continues reading from the same record instead of moving to a new one. It is used when a single input line contains several observations.
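A minimal sketch with two input lines that together hold five observations. The data set and variable names are made up.

```sas
data scores;
    /* @@ holds the line so each DATA step iteration reads the next name/score pair */
    input name $ score @@;
    datalines;
Ann 90 Bob 85 Cid 77
Dee 64 Eve 98
;
run;
```

Without the @@, SAS would read only the first pair on each line and the data set would have two observations instead of five.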
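19. Explain DATA _NULL_.
- DATA _NULL_ executes all the statements in the DATA step without creating an output dataset. It is useful for writing to the log or a file with PUT statements and for creating macro variables.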
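For instance, a step like the following reads SASHELP.CLASS and writes a row count to the log without creating any data set. The counter name n is arbitrary.

```sas
data _null_;
    set sashelp.class end=last;
    n + 1;                              /* sum statement: n is retained across iterations */
    if last then put 'Rows read: ' n;   /* written to the log on the final iteration */
run;
```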
20. What is the difference between Proc MEANS and Proc SUMMARY?
- PROC MEANS: prints a report in the OUTPUT window by default. If no VAR statement is given, it analyzes all the numeric variables.
- PROC SUMMARY: prints nothing by default; the PRINT option must be added to the PROC statement to produce a printed report. It analyzes the variables listed in the VAR statement; without a VAR statement it only counts observations.
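The two procedures can be compared side by side on SASHELP.CLASS. The output data set name class_stats and the statistic names are hypothetical.

```sas
/* Prints default statistics for every numeric variable */
proc means data=sashelp.class;
run;

/* Needs PRINT to produce a report, and VAR to choose the analysis variables */
proc summary data=sashelp.class print;
    var height weight;
run;

/* Typical PROC SUMMARY use: no printed report, results go to a data set */
proc summary data=sashelp.class;
    var height weight;
    output out=class_stats mean=mean_height mean_weight;
run;
```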
21. Mention SAS system options to debug SAS macros.
- MLOGIC
- MPRINT
- SYMBOLGEN
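A short sketch of switching the options on around a macro call. The macro name show is made up for illustration.

```sas
/* MPRINT shows generated SAS code, MLOGIC traces macro execution,
   SYMBOLGEN shows how macro variables resolve */
options mprint mlogic symbolgen;

%macro show(ds);
    proc print data=&ds(obs=3);
    run;
%mend show;

%show(sashelp.class)

/* Turn the options off again to keep the log readable */
options nomprint nomlogic nosymbolgen;
```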
22. What is the difference between SYMPUT and SYMGET?
- SYMPUT: a CALL routine used in a DATA step to store a value from the dataset in a macro variable.
- SYMGET: a function used in a DATA step to retrieve the value of a macro variable into the dataset.
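A round trip through both can be sketched as follows. The macro variable nrows and the data set check are hypothetical names.

```sas
/* CALL SYMPUT: copy a DATA step value into a macro variable */
data _null_;
    set sashelp.class end=last;
    if last then call symput('nrows', strip(put(_n_, 8.)));
run;

%put Rows in sashelp.class: &nrows;

/* SYMGET: read the macro variable back into a DATA step variable at execution time */
data check;
    rows = input(symget('nrows'), 8.);
run;
```

SYMGET returns a character value, so INPUT is used here to turn it back into a number.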
23. What are some programming errors that you have made?
- Not checking the log after submitting a program
- Missing a semicolon
- Ending PROC SQL with a RUN statement instead of QUIT
24. What is the difference between the SAS DATA step and SAS PROCs?
- The SAS DATA step is used to read in and manipulate data.
- SAS PROCs are prebuilt routines that perform tasks, such as sorting, analysis and reporting, on a SAS data set.
25. What is the use of the %INCLUDE statement?
- The %INCLUDE statement brings an entire external file into the SAS program you are currently running and submits its statements to SAS at that point.
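A minimal sketch, assuming a file of shared setup code exists at a placeholder path; both the path and the file name are hypothetical.

```sas
/* The path below is a placeholder; point it at a real .sas file.
   The SOURCE2 option writes the included statements to the log. */
%include '/path/to/setup.sas' / source2;
```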