Character Functions :
—————- Functions that change the case of characters—————–
Data ds;
Infile datalines;
Input Name&$15. Sex$ Age Height Weight;
Datalines;
Alfred Carter M 14 69 112.5
Alice Davis F 13 56.5 84
Barbara Smith F 13 65.3 98
Carol F 14 62.8 102.5
Henry M 14 63.5 102.5
James Kennedy M 12 57.3 83
Jane F 12 59.8 84.5
Janet Smit F 15 62.5 112.5
Jeffrey Truman M 13 62.5 84
John Killer M 12 59 99.5
Joyce Free F 11 51.3 50.5
;
Run;
UPCASE
Converts all letters in an argument to upper case
Syntax:-Upcase(string)
Example:-
Data ds1;
Set ds;
Name1=Upcase(Name);
Run;
/*converting letters into upper case and keeping them same column for same order like base dataset*/
Data ds2;
Set ds;
Name=Upcase(Name);
Run;
/*creating different column with uppercase letters*/
Data ds3(drop=Name);
Set ds;
Name1=Upcase(Name);
Run;
LOWCASE
Converts all letters in an argument to lowercase
Syntax:-lowcase(string)
Example:-
Data ds4;
Set ds2;
Name1=lowcase(Name);
Gender=lowcase(sex);
Run;
PROPCASE
Converts all words in an argument to proper case (like I Am Rajesh)
Propcase means In a word first letter is capital rest of all small
Syntax:-Propcase(string)
Example:-
Data ds5;
Set ds2;
Name1=propcase(Name);
Run;
Data allcase;
a=lowcase(‘THIS IS A DOG’);
b=propcase(a);
c=propcase(lowcase(‘THIS IS A DOG’));
d=Upcase(‘this is a dog’);
Put a=;
Put b=;
Put c=;
Put d=;
Run;
————————-Functions that extract part of strings———————-
Data ds;
Infile datalines;
Input idno name&$18. Team $ strtwght endwght ;
Datalines;
1331 Jason Schock Long blue 187 172
1067 Kanoko Nagasaka green 135 122
1251 Richard Rose blue 181 166
1192 Charlene Armstrong yellow 152 139
1352 Bette Long Schock green 156 137
1262 Yao Chen Garg blue 196 180
1124 Adrienne Fink green 156 142
;
Run;
SCAN
Selects a given word from a character expression
Selects particular word from character string
Syntax:- SCAN(string ,n<, delimiter(s)>)
Examples:-
Data scn1;
Set ds;
New_Name=scan(Name,2);
Run;
Data scn2;
Set ds;
Name1=scan(Name,1);
Name2=scan(Name,2);
Name3=scan(Name,3);
Run;
Data scn3;
Set ds;
New_Name=scan(Name,-3);
Run;
Data scn4;
Set ds;
New_Name=scan(Name,-1);
Run;
Data scn5;
A=’Madan,Mohan,Moorthy’;
B=Scan(A,2,’,’);
Run;
SUBSTR
Takes substrings of matrix elements
Selects particular part from character string
Syntax:- SUBSTR( matrix, position<, length>)
Examples:-
Data sbstr1;
Set ds;
New_Name=substr(Name,1,5);
Run;
Data sbstr2;
a=’rajesh’;
b=substr(a,8,5);
Run;
Data sbstr3;
a=’rajesh’;
Substr(a,1,7)=’madhan’;
Run;
Data sbstr4;
a=put(today(),date9.);
b=substr(a,3,3);
Run;
——–Functions that combines two or more strings together strings———
CAT
Concatenates character strings without removing leading or trailing blanks
Syntax:- CAT(string-1 <, … string-n>)
Data cat1;
a=’ The Olym’;
b=’pic Arts Festi’;
c=’ val includes works by D ‘;
d=’ale Chihuly.’;
Result=cat(a,b,c,d);
Put result $char.;
Run;
CATT
Concatenates character strings and removes trailing blanks
Syntax:-CATT(string-1 <, …string-n>)
Data cat2;
a=’ The Olym’;
b=’pic Arts Festi’;
c=’ val includes works by D ‘;
d=’ale Chihuly.’;
Result=catt(a,b,c,d);
Put result $char.;
Run;
CATS
Concatenates character strings and removes leading and trailing blanks
Syntax:-CATS(string-1 <, …string-n>)
Data cat3;
a=’ The Olym’;
b=’pic Arts Festi’;
c=’ val includes works by D ‘;
d=’ale Chihuly.’;
Result=cats(a,b,c,d);
Put result $char.;
Run;
CATX
Concatenates character strings, removes leading and trailing blanks, and inserts separators
Syntax:-CATX(separator, string-1 <, …string-n>)
Data cat4;
a=’The Olympic’;
b=’Arts Festival’;
c=’includes works by’;
d=’Dale Chihuly.’;
Result=catx(‘***’,a,b,c,d);
Put result $char.;
Run;
Data cat5;
Separator=’%%$%%’;
a=’ The Olym’;
b=’pic Arts Festi’;
c=’ val includes works by D ‘;
d=’ale Chihuly.’;
Result=catx(separator,a,b,c,d);
Put result $char.;
Run;
——————Functions that remove blanks from string—————-
LEFT
Left aligns a SAS character expression
Syntax :- LEFT(string)
Data remblank1;
a=’ My Name Is Ram’;
b=left(a);
Run;
RIGHT
Right aligns a character expression
Syntax :- RIGHT(string)
Data remblank2;
a=’My Name Is Ram ‘;
b=right(a);
Run;
TRIM
Removes trailing blanks from character expressions and returns one blank if the expression is missing
Syntax :- TRIM(string)
Data remblank4;
Input part1 $ 1-10 part2 $ 11-20;
hasblank=part1||part2;
noblank=part1||part2;
Put hasblank;
Put noblank;
Datalines;
apple sauce
;
Run;
Data remblank5;
Input part1$ part2$ ;
hasblank=part1||part2;
noblank=trim(part1)||part2;
Put hasblank;
Put noblank;
Datalines;
apple sauce
;
Run;
Data remblank5;
Input part1$1-8 part2$9-13 part3$15-20 ;
hasblank=part1||part2||part3;
noblank=trim(part1)||trim(part2)||part3;
Put hasblank;
Put noblank;
Datalines;
apple sauce mixer
orange hand
;
Run;
Data remblank6;
x=” “;
y=”>”||trim(x)||”<“;
Put y;
Run;
TRIMN
Removes trailing blanks from character expressions and returns a null string (zero blanks) if the expression is missing
Syntax :- TRIMN(string)
Data remblank6a;
x=” “;
z=”>”||trimn(x)||”<“;
put z;
Run;
Difference between Trim and Trimn
When expression is there both Trim and Trimn gives same result,
If expression is missing Trim returns one blank space but Trimn gives zero blank space.
STRIP
Returns a character string with all leading and trailing blanks removed
Syntax :- STRIP(string)
Data remblank3;
Input string $char8.;
Original = ‘*’ || string || ‘*’;
Stripped = ‘*’ || strip(string) || ‘*’;
Datalines;
abcd
abcd
abcd
abcdefgh
x y z
;
Run;
COMPRESS
Removes specific characters from a character string
Returns a character string with specified characters removed from the original string.
Syntax :- COMPRESS(<source><, chars><, modifiers>)
Examples:-
Compressing Blanks
Data remblank7;
a=’ AB C D ‘;
b=compress(a);
Run;
Data remblank7a;
x=’1 2 3 4 5′;
y=compress(x);
Put y;
Run;
Compressing Lowercase Letters
Data remblank7b;
x=’123-4567-8901 B b 234-5678-9012 c’;
y=compress(x,’abcd’);
put y;
Run;
Data remblank7c;
x=’123-4567-8901 B b 234-5678-9012 c’;
y=compress(x,’abcd’, ‘I’);
put y;
Run;
Compressing Upper case Letters
Data remblank7B;
x=’123-4567-8901 B b 234-5678-9012 c’;
y=compress(x,’ABCD’);
put y;
Run;
Data remblank7B;
x=’123-4567-8901 B b 234-5678-9012 c’;
y=compress(x,’ABCD’, ‘I’);
Run;
Compressing Space Characters
Data remblank7c;
x=’1 2 3 4 5′;
y=compress(x,,’s’);
Run;
Keeping Characters in the List
Data remblank7d;
x=’Math A English B Physics A’;
y=compress(x,’ABCD’,’k’);
Run;
Data remblank7d;
x=’Math a English b Physics a’;
y=compress(x,’abcd’);
Run;
Data remblank7d;
x=’Math a English b Physics a’;
y=compress(x,’abcd’,’k’);
put y;
Run;
Separating Numbers Or Text from Alphanumeric data
Data ds;
Infile datalines;
Input NAME $20.;
Datalines;
sas123sap
java456oracle
;
Run;
Data ds1;
Set ds;
a=compress(name,’ ‘,’a’);
Run;
Data ds2;
Set ds;
a=compress(name,’ ‘,’AK’);
Run;
data _null_ ;
string=’StudySAS Blog! 17752. ‘ ;
string1=compress(string,”) ; *Compress spaces. This is default
string2=compress(string,”,’ak’);*Compress alphabetic chars(1,2etc)
string3=compress(string,”,’d’) ; *Compress numerical values
string4=compress(string,”,’l’);*Compress lowercase characters
string5=compress(string,”,’u’);*Compress uppercase characters
string6=compress(string,’A’,’k’);*Keeps only specified characters
string7=compress(string,’!.’,’P’);*Compress Punctuations only(ALL SPECIAL CHARACTER)
string8=compress(string,’s’,’i’);*upper/lower case specified characters
string9=compress(string,”,’a’);*Compress all upper\lower case characters
string10=compress(string,”,’s’) ; * Compress or delete spaces
string11=compress(string,”,’kd’) ; *Compress alphabets (Keeps only digits)
put string1= ;
put string2= ;
put string3= ;
put string4= ;
put string5= ;
put string6= ;
put string7= ;
put string8= ;
put string9= ;
put string10=;
put string11=;
run ;
COMPBL
Removes multiple blanks from a character string.
Syntax:-Compbl(source)
Data remblank9;
x=’my name is ram’;
y=compbl(x);
Run;
Data remblank9a;
x=’My ‘;
y=’ Name ‘;
z=’ is Ram’;
a=x||y||z;
b=compbl(a);
Run;
Data ds1;
Infile datalines;
Input id$ fname$ lname$ sal;
Datalines;
001 mohan arisela 60000
002 padma narni 45000
003 varma maddina 50000
;
Run;
Data remblank10;
Set ds1;
Name1=fname||lname;
Name2=cat(fname,lname);
Name2a=cat(trim(fname),lname);
Name3=compbl(fname||lname);
Run;
————-Functions that substitute letters or words in string—————-
TRANSLATE
Replaces specific characters in a character expression
Data trns1;
x=translate(‘XYZW’,’AB’,’VW’);
Put x;
Run;
Data trns2;
x=translate(‘abc’,’sh’, ‘cg’);
Put x;
Run;
TRANWRD
Replaces or removes all occurrences of a word in a character string
Syntax:-TRANWRD(source,target,replacement)
Data trnw1;
name=’Mrs.Krishna’;
name1=tranwrd(name, “Mrs.”, “Mr.”);
put name name1;
run;
Data trnw2;
Infile datalines;
Input salelist $;
target=’FISH’;
replacement=’NIP’;
salelist1=tranwrd(salelist,target,replacement);
Datalines;
CATFISH
;
Run;
Data trnw2a;
Infile datalines;
Input salelist $;
length target $10 replacement $3;
target=’FISH’;
replacement=’NIP’;
salelist1=tranwrd(salelist,target,replacement);
Datalines;
CATFISH
;
Run;
The LENGTH statement left-aligns TARGET and pads it with blanks to the length of 10.
This causes the TRANWRD function to search for the character string ‘FISH ‘ in SALELIST Because the search fails, this line is written to the SAS log: CATFISH
You can use the TRIM function to exclude trailing blanks from a target or replacement variable. Use the TRIM function with TARGET
Data trnw2b;
Infile datalines;
Input salelist $;
length target $10 replacement $3;
target=’FISH’; replacement=’NIP’;
salelist1=tranwrd(salelist,trim(target),replacement);
Datalines;
CATFISH
;
Run;
——————Functions that searches for characters——————-
INDEX
Searches a character expression for a string of characters
Syntax:- INDEX(source, excerpt)
Data ind1;
a=’ABC.DEF (X=Y)’;
b=’D’;
x=index(a,b);
Put x;
Run;
Data ind2;
a=’ABC.DEF (X=Y)’;
b=’X=Y’;
x=index(a,b);
Put x;
Run;
Data ind3;
Infile datalines;
input name $ 1-12 age;
Datalines;
Harvey Smith 30
John West 35
Jim Cann 41
James Harvey 32
Harvy Adams 33
;
Run;
Now, let’s use the index function to find the cases with “Harvey” in the name
Data ind3a;
Set ind3;
x = index(name, “Harvey”);
Run;
INDEXC
Searches a character expression for special characters, and returns the position of the characters
Syntax:-INDEXC(source,excerpt-1<,… excerpt-n>)
Data indc1;
a=’ABC.DEP (X2=Y1)’;
x=indexc(a,’.’);
Run;
Data indc2;
a=’ABC.DEP (X2=Y1)’;
b=’=’;
x=indexc(a,b);
Run;
INDEXW
Searches a character expression for a specified string as a word
Syntax:- INDEXW(source, excerpt<,delimiter>)
Data indw1;
s=’asdf adog dog’;
p=’dog’;
x=indexw(s,p);
Run;
Data indw2;
s=’abcdef x=y’;
p=’def’;
x=indexw(s,p);
Run;
—————————– Other Character Functions ————————–
LENGTH
Returns length of string
Syntax:- LENGTH(string)
Data len;
a=’Mr.Krishna’;
b=length(a);
Run;
REVERSE
Returns string in reverse order
Syntax:- REVERSE(string)
Data rev;
a=’Mr.Krishna’;
b=reverse(a);
Run;
QUOTE
Ads double quotes to character values
Syntax:- QUOTE(string)
Data quot1;
a=’Mr.Krishna’;
b=quote(a);
Run;
DEQUOTE
Removes double quotes to character values
Syntax:- DEQUOTE(string)
Data quot2;
Set quot1;
c=dequote(a);
Run;
Data quot3;
Infile datalines;
Input id name$ sal;
Datalines;
001 abc 5000
002 def 6000
003 xyz 7000
;
Run;
Data quot3a;
Set quot3;
name1=quote(name);
name2=quote(trim(name));
name3=dequote(name2);
Run;
RANK
Returns the position of a character in the ASCII or EBCDIC collating sequence.
Syntax:-RANK(x)
The RANK function returns an integer that represents the position of the first character in the character expression. The result depends on your operating environment.
Data rnk1;
Infile datalines;
Input id name$ sal;
Rank_var=RANK(name);
Datalines;
001 clarc 5000
002 def 4000
003 clark 7000
;
Run;
Data rnk2 ;
a=Rank(‘A’);
b=Rank(‘krishna’); /* It gives position of first character only*/
Run;
REPEAT
Returns a character value that consists of the first argument repeated n+1 times.
Syntax:- Repeat(Argument,n)
Data rep;
Infile datalines;
Input id name$ sal;
x=repeat(name,10);
Datalines;
001 clarc 5000
002 def 4000
003 clark 7000
;
Run;
SOUNDEX
Encodes a string to facilitate searching.
Encodes a string and gives same result for same pronunciation strings in variable
Syntax:- SOUND(Argument)
Data snd;
Infile datalines;
Input id name$ sal;
y=soundex(name);
Datalines;
001 clarc 5000
002 def 4000
003 clark 7000
;
Run;
COLLATE
Returns a character string in ASCII or EBCDIC collating sequence.
Syntax:- (start-position<,end-position>) | (start-position<,,length>)
Data col1;
x=collate(45,99);
put @1 x ;
Run;
Data col2;
x=collate(1,,49);
put @1 x ;
Run;
ASCII Result
Data col3;
x=collate(48,,10); /*start-position<,,length*/
y=collate(48,57); /*start-position<,end-position */
put @1 x @14 y;
Run;
EBCIDIC Result
Data col4;
x=collate(240,,10); /*start-position<,,length*/
y=collate(240,249); /*start-position<,end-position */
put @1 x @14 y;
Run;
The maximum end-position for the EBCDIC collating sequence is 255.
ASCII collating sequences, end-position values between 0 and 127