SAS Avanzado
JorgEnriquez30915 de Enero de 2015
7.700 Palabras (31 Páginas)408 Visitas
CAPITULO 1
RELACION DE TERMINOS
PROCESAMIENTO
DE DATOS SAS SQL
Archivo Archivo de datos SAS Tabla
Registro Observación Fila
Campo Variable Columna
PROC SQL EN RELACION A OTROS PROCEDIMIENTOS
• Las sentencias en PROC SQL se componen de cláusulas.
• PROC SQL no requiere la sentencia RUN.
• Bajo un PROC SQL se pueden ejecutar varias sentencias.
• Si PROC SQL no ha terminado, el estado del sistema despliega “PROC SQL running”.
• El orden de las cláusulas dentro de una sentencia SELECT es importante.
• Un PROC SQL step que contiene una o más sentencias SELECT se denomina una consulta PROC SQL. La sentencia SELECT es solamente una de varias sentencias que puede ser usada con PROC SQL.
• A PROC SQL query produces a result set that can be output as a report, a table, or a PROC SQL view.
OTROS
cláusula SELECT es la primera cláusula de la sentencia SELECT, permite especificar una o más columnas las cuales pueden ser columnas existentes o creadas
cláusula FROM Se especifica la tabla a ser consultada
cláusula WHERE Para seleccionar observaciones en base a una condición
cláusula ORDER BY El orden de las filas en la salida de una consulta PROC SQL
cláusula GROUP BY en el PROC SQL step para resumir los datos en grupos. Las funciones de resumen producen un resumen estadístico para cada grupo que es definido en la cláusula GROUP BY.
sentencia CREATE TABLE Para crear una nueva tabla a partir de los resultados de una consulta
cláusula HAVING Para refinar una consulta PROC SQL que contiene una cláusula GROUP BY. Esta cláusula restringe los grupos que son desplegados.
use an asterisk (*) in the SELECT clause To display all columns in the order in which they are stored in the table
use the keyword DISTINCT in the SELECT clause To eliminate duplicate rows from your query results
opción FEEDBACK de la sentencia PROC SQL en asociamiento con SELECT * muestra la lista de columnas seleccionadas en el SAS log
opción OUTOBS= de la sentencia PROC SQL indica el número máximo de observaciones a desplegarse
opción INOBS= de la sentencia PROC SQL indica el número máximo de observaciones a leerse
Operadores en sentencia WHERE Sentencia WHERE de PROC SQL igual a sentencia WHERE de DATA Step
IS MISSING or IS NULL operador Para sentencia WHERE
LABEL= Opción de la cláusula SELECT
FORMAT= Opción de la cláusula SELECT
If a summary function specifies one column as argument the calculation is performed down the column
If a summary function specifies multiple columns as arguments the calculation is performed across columns for each row
If a GROUP BY clause is not present in the query PROC SQL applies the function to the entire table
If a GROUP BY clause is present in the query PROC SQL applies the function to each group specified in the GROUP BY clause
COUNT(*) Para contar el número total de filas en un grupo o una tabla
COUNT(COLUMN) Para contar el número total de filas en un grupo o una tabla para el cual hay valores no faltantes en la columna COLUMN
COUNT(DISTINCT COLUMN) Para contar el número total de valores únicos en la columna COLUMN
A subquery noncorrelated a self-contained subquery that executes independently of the outer query
A subquery correlated a dependent subquery that requires one or more values to be passed to it by the outer query before the subquery can return a value to the outer query
Using Single-Value Noncorrelated Subqueries Use operators lt, gt, eq, le, ge and negations of those
Using Multiple-Value Noncorrelated Subqueries Use operators IN, ANY, ALL, EXISTS
Comparison > ANY Outer Query Selects values that are greater than any value returned by the subquery
Comparison < ANY Outer Query Selects values that are less than any value returned by the subquery
Comparison = ANY Outer Query Selects values that are equal to any value returned by the subquery
Comparison > ALL Outer Query Selects values that are greater than all value returned by the subquery
Comparison < ALL Outer Query Selects values that are less than all value returned by the subquery
Condition EXISTS It is true if the subquery returns at least one row
Condition NOT EXISTS It is true if the subquery returns no data
VALIDATE keyword before a SELECT statement You specify the VALIDATE keyword just before a SELECT statement. SAS checks the syntax of present query for accuracy but does not execute her
An inner join It combines and displays only the rows from the first table that match rows from the second table, based on the matching criteria (also known as join conditions) that are specified in the WHERE clause
Table Alias a temporary, alternate name for any or all tables in any PROC SQL query
type of outer join Left, Right, Full
2 You can perform an outer join on only two tables or views at a time
Left Outer Join It retrieves all rows that match across tables, based on the specified matching criteria (join conditions), plus nonmatching rows from the left table (the first table specified in the FROM clause).
Right Outer Join It retrieves all rows that match across tables, based on the specified matching criteria (join conditions), plus nonmatching rows from the right table (the second table specified in the FROM clause).
Full Outer Join It retrieves both matching rows and nonmatching rows from both tables.
COALESCE Function It overlays the specified columns
in-line view It is a nested query that is specified in the outer query's FROM clause
set operation It is a SELECT statement that contains two groups of query clauses (each group beginning with a SELECT clause), a set operator, and, optionally, one or both of the keywords ALL and CORR (CORRESPONDING).
multiple set operations When processing multiple set operators, PROC SQL follows a default order of precedence, unless this order is overridden by parentheses in the expression(s). By default, INTERSECT is evaluated first. OUTER UNION, UNION, and EXCEPT all have the same level of precedence. In equal precedence processing starts from first two queries.
EXCEPT It selects unique rows from the first table that are not found in the second table and, it overlays columns based on their position in the SELECT clause without regard to the individual column names.
INTERSECT It selects unique rows that are common to both tables and, it overlays columns based on their position in the SELECT clause without regard to the individual column names.
UNION It selects unique rows from one or both tables and, it overlays columns based on their position in the SELECT clause without regard to the individual column names.
OUTER UNION It selects all rows from both tables and, it does not overlay columns.
ALL It makes only one pass through the data and does not remove duplicate rows. You do not care if there are duplicates.
CORR (or CORRESPONDING) It compares and overlays columns by name instead of by position. When used with EXCEPT, INTERSECT, and UNION, removes any columns that do not have the same name in both tables. When used with OUTER UNION, overlays same-named columns and displays columns that have nonmatching names without overlaying. If an alias is assigned to a column in the SELECT clause, CORR will use the alias instead of the permanent column name.
Method of Creating Table • create an empty table by defining columns
• create an empty table that is like (has the same columns and attributes as) an existing table
• create a populated table (a table with both columns and rows of data) from a query result
data-type It is enclosed in parentheses and specifies one of the following: CHARACTER (or CHAR) | VARCHAR | INTEGER (or INT) | SMALLINT | DECIMAL (or DEC) | NUMERIC (or NUM) | FLOAT | REAL | DOUBLE PRECISION | DATE
data-type column-width which is enclosed in parentheses, is an integer that specifies the width of the column
column-modifier It is one of the following: INFORMAT= | FORMAT= | LABEL= . More than one column-modifier may be specified
column-constraint It specifies an integrity constraint
MESSAGE= and MSGTYPE= They specify an error message that is related to an integrity constraint
DESCRIBE TABLE statement • To display a list of columns and column attributes for one or more tables in the SAS log, regardless of whether the tables contain rows of data
• To display a CREATE INDEX statement in the SAS log for each index that is defined for one or more specified tables
Method of Inserting Row Example • insert values by column name by using the SET clause
• insert lists of values by using the VALUES clause
• insert rows that are copied from another table by using a query result
Constraint CHECK Ensures that a specific set or range of values are the only values in a column. It can also check the validity of a value in one column based on a value in another column within the same row
Constraint NOT NULL Guarantees that a column has non-missing values in each row
Constraint UNIQUE / DISTINCT Enforces uniqueness for the values of a column
Constraint PRIMARY KEY Uniquely defines a row within a table, which can be a single column or a set of columns. A table can have only one PRIMARY KEY. PRIMARY KEY includes the attributes of the constraints NOT NULL and UNIQUE
Constraint
...