f



Deleting characters from multiple string variables

Hello,
I have 360 string variables that all look like this:

Variable name: DoR_Tm1_Visit
Responses
Days_OoR22
Days_OoR4
Days_OoR3
Days_OoR2
Days_OoR18
Days_OoR2

(responses go from 1 to 30 and have missing data)

I want to remove the text and just have the number
DoR_Tm1_Visit
22
4
3
2
18
2

I have used this syntax:

string DoR_Tm1_Visit _Num(a20).
COMPUTE DoR_Tm1_Visit_Num = NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
EXECUTE .

Which works, but it is clunky and coming up with an error. Also it is taking half a day to run for all the variables and my huge data set


Could do with some help on this one!
 thanks
0
Erin
10/17/2016 9:28:36 PM
comp.soft-sys.stat.spss 5679 articles. 0 followers. Post Follow

11 Replies
278 Views

Similar Articles

[PageSpeed] 50

On Monday, October 17, 2016 at 5:28:39 PM UTC-4, Erin wrote:
> Hello,
> I have 360 string variables that all look like this:
>=20
> Variable name: DoR_Tm1_Visit
> Responses
> Days_OoR22
> Days_OoR4
> Days_OoR3
> Days_OoR2
> Days_OoR18
> Days_OoR2
>=20
> (responses go from 1 to 30 and have missing data)
>=20
> I want to remove the text and just have the number
> DoR_Tm1_Visit
> 22
> 4
> 3
> 2
> 18
> 2
>=20
> I have used this syntax:
>=20
> string DoR_Tm1_Visit _Num(a20).
> COMPUTE DoR_Tm1_Visit_Num =3D NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
> EXECUTE .
>=20
> Which works, but it is clunky and coming up with an error. Also it is tak=
ing half a day to run for all the variables and my huge data set
>=20
>=20
> Could do with some help on this one!
>  thanks

It was not easy to see in your original post, but notice in the quoted line=
s above that there is a blank space between Visit and _Num in your STRING c=
ommand.  That is what is causing the error message you are seeing, I suspec=
t. =20

But if you remove the blank space, you'll just get another error, because y=
our COMPUTE command will then be attempting to assign a numeric value to a =
string variable.  So just get rid of the STRING command, and everything sho=
uld work fine. =20

It will work fine for that particular variable, at least.  But you said you=
 have 360 variables.  Is the to-be-removed letter string the same in all of=
 them?  Or can it vary, depending on the variable?  If it can vary, you wan=
t some kind of looping solution.  E.g., here is a simple DO-REPEAT solution=
 for 3 variables where the to-be-removed string is constant within each var=
iable, but varies across variables.

DATA LIST FIXED / V1 1-10 (A) V2 11-17 (A) V3 18-27 (A) .
BEGIN DATA
Days_OoR22 Junk22 Rubbish22
Days_OoR4  Junk4  Rubbish4
Days_OoR3  Junk3  Rubbish3
Days_OoR2  Junk2  Rubbish2
Days_OoR18 Junk18 Rubbish18
Days_OoR2  Junk2  Rubbish2
END DATA.

DO REPEAT V =3D V1 to V3 /
 Junk =3D "Days_OoR" "Junk" "Rubbish".
- COMPUTE V =3D REPLACE(V,Junk,"").
END REPEAT.
* Now change the variable type to numeric.
ALTER TYPE V1 to V3 (F2.0).
LIST.

OUTPUT:

V1 V2 V3=20
=20
22 22 22=20
 4  4  4=20
 3  3  3=20
 2  2  2=20
18 18 18=20
 2  2  2

HTH.
0
Bruce
10/17/2016 10:43:22 PM
That code can't possibly work.
You declare DoR_Tm1_Visit _Num as STRING and then hammer at it with the NUMBER function.
Riddle me this:
1. What error message do you receive?

2. The notion 'huge data set' is very subjective.
How many cases do you actually have?

3. Why not run something USEFUL/INFORMATIVE rather than EXECUTE?  

4. Are the variables contiguous in the file?

Maybe this?

DO REPEAT strvar=var1 TO var360
         /numvar=new1 TO new360.
COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2).
END REPEAT.



On Monday, October 17, 2016 at 5:28:39 PM UTC-4, Erin wrote:
> Hello,
> I have 360 string variables that all look like this:
> 
> Variable name: DoR_Tm1_Visit
> Responses
> Days_OoR22
> Days_OoR4
> Days_OoR3
> Days_OoR2
> Days_OoR18
> Days_OoR2
> 
> (responses go from 1 to 30 and have missing data)
> 
> I want to remove the text and just have the number
> DoR_Tm1_Visit
> 22
> 4
> 3
> 2
> 18
> 2
> 
> I have used this syntax:
> 
> string DoR_Tm1_Visit _Num(a20).
> COMPUTE DoR_Tm1_Visit_Num = NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
> EXECUTE .
> 
> Which works, but it is clunky and coming up with an error. Also it is taking half a day to run for all the variables and my huge data set
> 
> 
> Could do with some help on this one!
>  thanks

0
David
10/17/2016 10:49:38 PM
This syntax:
COMPUTE DoR_Tm1_Visit_Num = NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
EXECUTE .

Does take out all of the unwanted text but takes a long time and results in this error:

Warning # 635
The string to be converted via the NUMBER function is of zero length.  The
result has been set to the system-missing value.
Command line: 227  Current case: 3  Current splitfile group: 1

(the same warning comes up multiple times)

Warning # 92
The limit for MXWARNS warnings in this data pass has been exceeded.  Further
warnings have been suppressed.  To change the limit use SET MXWARNS.


Tried the syntax:
string DoR_Visit_Tm1 TO DoR_Visit_Tm13 (a20). (not sure if I need to define)

DO REPEAT strvar=DoR_Tm1_Visit TO DoR_Tm13_Visit.
         /numvar=DoR_Visit_Tm1 TO DoR_Visit_Tm13. 
COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2). 
END REPEAT. 
It runs and has no errors if I run the define string bit.
But the data is blank (i.e. not missing, but blank)

Other qns:
2- 200,000 case
3- Open to any suggestions you may have
4- No, but I can re-order

Thank you kindly





On Tuesday, October 18, 2016 at 9:49:44 AM UTC+11, David Marso wrote:
> That code can't possibly work.
> You declare DoR_Tm1_Visit _Num as STRING and then hammer at it with the NUMBER function.
> Riddle me this:
> 1. What error message do you receive?
> 
> 2. The notion 'huge data set' is very subjective.
> How many cases do you actually have?
> 
> 3. Why not run something USEFUL/INFORMATIVE rather than EXECUTE?  
> 
> 4. Are the variables contiguous in the file?
> 
> Maybe this?
> 
> DO REPEAT strvar=var1 TO var360
>          /numvar=new1 TO new360.
> COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2).
> END REPEAT.
> 
> 
> 
> On Monday, October 17, 2016 at 5:28:39 PM UTC-4, Erin wrote:
> > Hello,
> > I have 360 string variables that all look like this:
> > 
> > Variable name: DoR_Tm1_Visit
> > Responses
> > Days_OoR22
> > Days_OoR4
> > Days_OoR3
> > Days_OoR2
> > Days_OoR18
> > Days_OoR2
> > 
> > (responses go from 1 to 30 and have missing data)
> > 
> > I want to remove the text and just have the number
> > DoR_Tm1_Visit
> > 22
> > 4
> > 3
> > 2
> > 18
> > 2
> > 
> > I have used this syntax:
> > 
> > string DoR_Tm1_Visit _Num(a20).
> > COMPUTE DoR_Tm1_Visit_Num = NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
> > EXECUTE .
> > 
> > Which works, but it is clunky and coming up with an error. Also it is taking half a day to run for all the variables and my huge data set
> > 
> > 
> > Could do with some help on this one!
> >  thanks

0
erin
10/18/2016 1:36:01 AM
Thank you, nice spotting. 

The to-be-removed letter string varies:

DATA LIST FIXED / V1 1-13 (A) V2 14-27 (A) V3 28-41 (A) . 
BEGIN DATA 
Days_OoR1	Study_Grp1	Study1
Days_OoR2	Study_Grp2	Study2
Days_OoR3	Study_Grp3	Study3
Days_OoR4	Study_Grp4	Study4
Days_OoR5	Study_Grp5	Study5
Days_OoR6	Study_Grp6	Study6
Days_OoR7	Study_Grp7	Study7
Days_OoR8	Study_Grp8	Study8
Days_OoR9	Study_Grp9	Study9
Days_OoR10	Study_Grp10	Study10
Days_OoR11	Study_Grp11	Study11
Days_OoR12	Study_Grp12	Study12
Days_OoR13	Study_Grp13	Study13
Days_OoR14	Study_Grp14	Study14
Days_OoR15	Study_Grp15	Study15
Days_OoR16	Study_Grp16	Study16
Days_OoR17	Study_Grp17	Study17
Days_OoR18	Study_Grp18	Study18
Days_OoR19	Study_Grp19	Study19
Days_OoR20	Study_Grp20	Study20
Days_OoR21	Study_Grp21	Study21
Days_OoR22	Study_Grp22	Study22
Days_OoR23	Study_Grp23	Study23
Days_OoR24	Study_Grp24	Study24
Days_OoR25	Study_Grp25	Study25
Days_OoR26	Study_Grp26	Study26
Days_OoR27	Study_Grp27	Study27
Days_OoR28	Study_Grp28	Study28
Days_OoR29	Study_Grp29	Study29
Days_OoR30	Study_Grp30	Study30
END DATA. 

DO REPEAT V = V1 to V3 / 
 Junk = "Days_OoR" "Study_Grp" "Study". 
COMPUTE V = REPLACE(V,Junk,""). 
END REPEAT.

This also runs, but comes back with blanks for all responses for all cases

This is not very familiar to me- so it may be my syntax interpretations...
0
erin
10/18/2016 2:34:13 AM
Ok, seems that error is replated to defining the string variable...

So if I run this on its own:
DO REPEAT strvar=DoR_Tm1_Visit TO DoR_Tm13_Visit 
         /numvar=DoR_Tm1_Visit TO DoR_Tm13_Visit. 
COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2). 
END REPEAT. 
 

This error comes up

>Error # 4285 in column 38.  Text: Days_OoR 
>Incorrect variable name: either the name is more than 64 characters, or it is 
>not defined by a previous command. 
>Execution of this command stops.









On Tuesday, October 18, 2016 at 9:49:44 AM UTC+11, David Marso wrote:
> That code can't possibly work.
> You declare DoR_Tm1_Visit _Num as STRING and then hammer at it with the NUMBER function.
> Riddle me this:
> 1. What error message do you receive?
> 
> 2. The notion 'huge data set' is very subjective.
> How many cases do you actually have?
> 
> 3. Why not run something USEFUL/INFORMATIVE rather than EXECUTE?  
> 
> 4. Are the variables contiguous in the file?
> 
> Maybe this?
> 
> DO REPEAT strvar=var1 TO var360
>          /numvar=new1 TO new360.
> COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2).
> END REPEAT.
> 
> 
> 
> On Monday, October 17, 2016 at 5:28:39 PM UTC-4, Erin wrote:
> > Hello,
> > I have 360 string variables that all look like this:
> > 
> > Variable name: DoR_Tm1_Visit
> > Responses
> > Days_OoR22
> > Days_OoR4
> > Days_OoR3
> > Days_OoR2
> > Days_OoR18
> > Days_OoR2
> > 
> > (responses go from 1 to 30 and have missing data)
> > 
> > I want to remove the text and just have the number
> > DoR_Tm1_Visit
> > 22
> > 4
> > 3
> > 2
> > 18
> > 2
> > 
> > I have used this syntax:
> > 
> > string DoR_Tm1_Visit _Num(a20).
> > COMPUTE DoR_Tm1_Visit_Num = NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
> > EXECUTE .
> > 
> > Which works, but it is clunky and coming up with an error. Also it is taking half a day to run for all the variables and my huge data set
> > 
> > 
> > Could do with some help on this one!
> >  thanks

0
erin
10/18/2016 3:46:16 AM
Oops my bad:
Days_OoR should have quotes.

COMPUTE numvar=NUMBER(REPLACE(strvar,"Days_OoR",""),F2). 


On Monday, October 17, 2016 at 11:46:32 PM UTC-4, erin.ps...@gmail.com wrote:
> Ok, seems that error is replated to defining the string variable...
> 
> So if I run this on its own:
> DO REPEAT strvar=DoR_Tm1_Visit TO DoR_Tm13_Visit 
>          /numvar=DoR_Tm1_Visit TO DoR_Tm13_Visit. 
> COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2). 
> END REPEAT. 
>  
> 
> This error comes up
> 
> >Error # 4285 in column 38.  Text: Days_OoR 
> >Incorrect variable name: either the name is more than 64 characters, or it is 
> >not defined by a previous command. 
> >Execution of this command stops.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Tuesday, October 18, 2016 at 9:49:44 AM UTC+11, David Marso wrote:
> > That code can't possibly work.
> > You declare DoR_Tm1_Visit _Num as STRING and then hammer at it with the NUMBER function.
> > Riddle me this:
> > 1. What error message do you receive?
> > 
> > 2. The notion 'huge data set' is very subjective.
> > How many cases do you actually have?
> > 
> > 3. Why not run something USEFUL/INFORMATIVE rather than EXECUTE?  
> > 
> > 4. Are the variables contiguous in the file?
> > 
> > Maybe this?
> > 
> > DO REPEAT strvar=var1 TO var360
> >          /numvar=new1 TO new360.
> > COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2).
> > END REPEAT.
> > 
> > 
> > 
> > On Monday, October 17, 2016 at 5:28:39 PM UTC-4, Erin wrote:
> > > Hello,
> > > I have 360 string variables that all look like this:
> > > 
> > > Variable name: DoR_Tm1_Visit
> > > Responses
> > > Days_OoR22
> > > Days_OoR4
> > > Days_OoR3
> > > Days_OoR2
> > > Days_OoR18
> > > Days_OoR2
> > > 
> > > (responses go from 1 to 30 and have missing data)
> > > 
> > > I want to remove the text and just have the number
> > > DoR_Tm1_Visit
> > > 22
> > > 4
> > > 3
> > > 2
> > > 18
> > > 2
> > > 
> > > I have used this syntax:
> > > 
> > > string DoR_Tm1_Visit _Num(a20).
> > > COMPUTE DoR_Tm1_Visit_Num = NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
> > > EXECUTE .
> > > 
> > > Which works, but it is clunky and coming up with an error. Also it is taking half a day to run for all the variables and my huge data set
> > > 
> > > 
> > > Could do with some help on this one!
> > >  thanks

0
David
10/18/2016 5:26:21 AM
On Monday, October 17, 2016 at 10:34:15 PM UTC-4, erin.ps...@gmail.com wrote:
> Thank you, nice spotting. 
> 
> The to-be-removed letter string varies:
> 
> DATA LIST FIXED / V1 1-13 (A) V2 14-27 (A) V3 28-41 (A) . 
> BEGIN DATA 
> Days_OoR1	Study_Grp1	Study1
> Days_OoR2	Study_Grp2	Study2
> Days_OoR3	Study_Grp3	Study3
> Days_OoR4	Study_Grp4	Study4
> Days_OoR5	Study_Grp5	Study5
> Days_OoR6	Study_Grp6	Study6
> Days_OoR7	Study_Grp7	Study7
> Days_OoR8	Study_Grp8	Study8
> Days_OoR9	Study_Grp9	Study9
> Days_OoR10	Study_Grp10	Study10
> Days_OoR11	Study_Grp11	Study11
> Days_OoR12	Study_Grp12	Study12
> Days_OoR13	Study_Grp13	Study13
> Days_OoR14	Study_Grp14	Study14
> Days_OoR15	Study_Grp15	Study15
> Days_OoR16	Study_Grp16	Study16
> Days_OoR17	Study_Grp17	Study17
> Days_OoR18	Study_Grp18	Study18
> Days_OoR19	Study_Grp19	Study19
> Days_OoR20	Study_Grp20	Study20
> Days_OoR21	Study_Grp21	Study21
> Days_OoR22	Study_Grp22	Study22
> Days_OoR23	Study_Grp23	Study23
> Days_OoR24	Study_Grp24	Study24
> Days_OoR25	Study_Grp25	Study25
> Days_OoR26	Study_Grp26	Study26
> Days_OoR27	Study_Grp27	Study27
> Days_OoR28	Study_Grp28	Study28
> Days_OoR29	Study_Grp29	Study29
> Days_OoR30	Study_Grp30	Study30
> END DATA. 
> 
> DO REPEAT V = V1 to V3 / 
>  Junk = "Days_OoR" "Study_Grp" "Study". 
> COMPUTE V = REPLACE(V,Junk,""). 
> END REPEAT.
> 
> This also runs, but comes back with blanks for all responses for all cases
> 
> This is not very familiar to me- so it may be my syntax interpretations...

Try this:

DATA LIST FIXED / V1 1-13 (A) V2 14-28 (A) V3 35-42 (A) .
BEGIN DATA
Days_OoR1        Study_Grp1        Study1
Days_OoR2        Study_Grp2        Study2
Days_OoR3        Study_Grp3        Study3
Days_OoR4        Study_Grp4        Study4
Days_OoR5        Study_Grp5        Study5
Days_OoR6        Study_Grp6        Study6
Days_OoR7        Study_Grp7        Study7
Days_OoR8        Study_Grp8        Study8
Days_OoR9        Study_Grp9        Study9
Days_OoR10       Study_Grp10       Study10
Days_OoR11       Study_Grp11       Study11
Days_OoR12       Study_Grp12       Study12
Days_OoR13       Study_Grp13       Study13
Days_OoR14       Study_Grp14       Study14
Days_OoR15       Study_Grp15       Study15
Days_OoR16       Study_Grp16       Study16
Days_OoR17       Study_Grp17       Study17
Days_OoR18       Study_Grp18       Study18
Days_OoR19       Study_Grp19       Study19
Days_OoR20       Study_Grp20       Study20
Days_OoR21       Study_Grp21       Study21
Days_OoR22       Study_Grp22       Study22
Days_OoR23       Study_Grp23       Study23
Days_OoR24       Study_Grp24       Study24
Days_OoR25       Study_Grp25       Study25
Days_OoR26       Study_Grp26       Study26
Days_OoR27       Study_Grp27       Study27
Days_OoR28       Study_Grp28       Study28
Days_OoR29       Study_Grp29       Study29
Days_OoR30       Study_Grp30       Study30
END DATA.

DO REPEAT V = V1 to V3 /
 Junk = "Days_OoR" "Study_Grp" "Study".
COMPUTE V = REPLACE(V,Junk,"").
END REPEAT. 
****************************.
ALTER TYPE V1 to V3 (F5.0).
DESCRIPTIVES V1 to V3.
****************************.

0
Bruce
10/18/2016 12:32:08 PM
On Monday, October 17, 2016 at 10:34:15 PM UTC-4, erin.ps...@gmail.com wrot=
e:
> Thank you, nice spotting.=20
>=20
> The to-be-removed letter string varies:
>=20
> DATA LIST FIXED / V1 1-13 (A) V2 14-27 (A) V3 28-41 (A) .=20
> BEGIN DATA=20
> Days_OoR1	Study_Grp1	Study1
> Days_OoR2	Study_Grp2	Study2
> Days_OoR3	Study_Grp3	Study3
> Days_OoR4	Study_Grp4	Study4
> Days_OoR5	Study_Grp5	Study5
> Days_OoR6	Study_Grp6	Study6
> Days_OoR7	Study_Grp7	Study7
> Days_OoR8	Study_Grp8	Study8
> Days_OoR9	Study_Grp9	Study9
> Days_OoR10	Study_Grp10	Study10
> Days_OoR11	Study_Grp11	Study11
> Days_OoR12	Study_Grp12	Study12
> Days_OoR13	Study_Grp13	Study13
> Days_OoR14	Study_Grp14	Study14
> Days_OoR15	Study_Grp15	Study15
> Days_OoR16	Study_Grp16	Study16
> Days_OoR17	Study_Grp17	Study17
> Days_OoR18	Study_Grp18	Study18
> Days_OoR19	Study_Grp19	Study19
> Days_OoR20	Study_Grp20	Study20
> Days_OoR21	Study_Grp21	Study21
> Days_OoR22	Study_Grp22	Study22
> Days_OoR23	Study_Grp23	Study23
> Days_OoR24	Study_Grp24	Study24
> Days_OoR25	Study_Grp25	Study25
> Days_OoR26	Study_Grp26	Study26
> Days_OoR27	Study_Grp27	Study27
> Days_OoR28	Study_Grp28	Study28
> Days_OoR29	Study_Grp29	Study29
> Days_OoR30	Study_Grp30	Study30
> END DATA.=20
>=20
> DO REPEAT V =3D V1 to V3 /=20
>  Junk =3D "Days_OoR" "Study_Grp" "Study".=20
> COMPUTE V =3D REPLACE(V,Junk,"").=20
> END REPEAT.
>=20
> This also runs, but comes back with blanks for all responses for all case=
s
>=20
> This is not very familiar to me- so it may be my syntax interpretations..=
..

Given that you have 360 variables, the DO-REPEAT method I suggested earlier=
 is not ideal, because it requires you to list all 360 of the to-be-removed=
 strings.  Here's a more general approach that scales up much more easily. =
 It uses a "trick" shown on Andy Wheeler's website:

https://andrewpwheeler.wordpress.com/2013/07/09/quick-spss-tip-cleaning-up-=
irregular-characters-in-strings/

* Read in some sample data.
DATA LIST FIXED / V1 1-13 (A) V2 14-28 (A) V3 35-42 (A) .
BEGIN DATA
Days_OoR1        Study_Grp1        Study1
Days_OoR2        Study_Grp2        Study2
Days_OoR3        Study_Grp3        Study3
Days_OoR4        Study_Grp4        Study4
Days_OoR5        Study_Grp5        Study5
Days_OoR6        Study_Grp6        Study6
Days_OoR7        Study_Grp7        Study7
Days_OoR8        Study_Grp8        Study8
Days_OoR9        Study_Grp9        Study9
Days_OoR10       Study_Grp10       Study10
Days_OoR11       Study_Grp11       Study11
Days_OoR12       Study_Grp12       Study12
Days_OoR13       Study_Grp13       Study13
Days_OoR14       Study_Grp14       Study14
Days_OoR15       Study_Grp15       Study15
Days_OoR16       Study_Grp16       Study16
Days_OoR17       Study_Grp17       Study17
Days_OoR18       Study_Grp18       Study18
Days_OoR19       Study_Grp19       Study19
Days_OoR20       Study_Grp20       Study20
Days_OoR21       Study_Grp21       Study21
Days_OoR22       Study_Grp22       Study22
Days_OoR23       Study_Grp23       Study23
Days_OoR24       Study_Grp24       Study24
Days_OoR25       Study_Grp25       Study25
Days_OoR26       Study_Grp26       Study26
Days_OoR27       Study_Grp27       Study27
Days_OoR28       Study_Grp28       Study28
Days_OoR29       Study_Grp29       Study29
Days_OoR30       Study_Grp30       Study30
END DATA.

* String variables used in a vector must all=20
* be formatted to the same length.
ALTER TYPE V1 to V3 (A20).
VECTOR v =3D V1 to V3.
LOOP # =3D 1 to 3.
- LOOP #i =3D 1 to 255.
*  ASCII codes for 0 to 9 are 48 to 57.
-  IF NOT RANGE(#i,48,57) v(#) =3D REPLACE(v(#),STRING(#i,PIB),"").
- END LOOP.
END LOOP.
ALTER TYPE V1 to V3 (F5.0).
LIST.

* To scale it up for 360 contiguous variables, replace
* V1 to V3 with V1 to V360, and replace 1 to 3 with 1 to 360.

HTH.
0
Bruce
10/18/2016 9:35:52 PM
Ah yes, ok, i can get it to work with a single variable. But not across multiple.

e.g. this works:
COMPUTE DoR_Tm1_Visit_Num = NUMBER(REPLACE(DoR_Tm1_Visit,"Days_OoR",""),F2). 
EXECUTE .

and this runs without error:

DO REPEAT strvar=DoR_Tm1_Visit TO DoR_Tm13_Visit
         /numvar=DoR_Visit_Tm1 TO DoR_Visit_Tm13. 
COMPUTE numvar=NUMBER(REPLACE(strvar,"Days_OoR",""),F2). 
END REPEAT. 

But returns missing values for the new variables






On Tuesday, October 18, 2016 at 4:26:26 PM UTC+11, David Marso wrote:
> Oops my bad:
> Days_OoR should have quotes.
> 
> COMPUTE numvar=NUMBER(REPLACE(strvar,"Days_OoR",""),F2). 
> 
> 
> On Monday, October 17, 2016 at 11:46:32 PM UTC-4, erin.ps...@gmail.com wrote:
> > Ok, seems that error is replated to defining the string variable...
> > 
> > So if I run this on its own:
> > DO REPEAT strvar=DoR_Tm1_Visit TO DoR_Tm13_Visit 
> >          /numvar=DoR_Tm1_Visit TO DoR_Tm13_Visit. 
> > COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2). 
> > END REPEAT. 
> >  
> > 
> > This error comes up
> > 
> > >Error # 4285 in column 38.  Text: Days_OoR 
> > >Incorrect variable name: either the name is more than 64 characters, or it is 
> > >not defined by a previous command. 
> > >Execution of this command stops.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Tuesday, October 18, 2016 at 9:49:44 AM UTC+11, David Marso wrote:
> > > That code can't possibly work.
> > > You declare DoR_Tm1_Visit _Num as STRING and then hammer at it with the NUMBER function.
> > > Riddle me this:
> > > 1. What error message do you receive?
> > > 
> > > 2. The notion 'huge data set' is very subjective.
> > > How many cases do you actually have?
> > > 
> > > 3. Why not run something USEFUL/INFORMATIVE rather than EXECUTE?  
> > > 
> > > 4. Are the variables contiguous in the file?
> > > 
> > > Maybe this?
> > > 
> > > DO REPEAT strvar=var1 TO var360
> > >          /numvar=new1 TO new360.
> > > COMPUTE numvar=NUMBER(REPLACE(strvar,Days_OoR,""),F2).
> > > END REPEAT.
> > > 
> > > 
> > > 
> > > On Monday, October 17, 2016 at 5:28:39 PM UTC-4, Erin wrote:
> > > > Hello,
> > > > I have 360 string variables that all look like this:
> > > > 
> > > > Variable name: DoR_Tm1_Visit
> > > > Responses
> > > > Days_OoR22
> > > > Days_OoR4
> > > > Days_OoR3
> > > > Days_OoR2
> > > > Days_OoR18
> > > > Days_OoR2
> > > > 
> > > > (responses go from 1 to 30 and have missing data)
> > > > 
> > > > I want to remove the text and just have the number
> > > > DoR_Tm1_Visit
> > > > 22
> > > > 4
> > > > 3
> > > > 2
> > > > 18
> > > > 2
> > > > 
> > > > I have used this syntax:
> > > > 
> > > > string DoR_Tm1_Visit _Num(a20).
> > > > COMPUTE DoR_Tm1_Visit_Num = NUMBER(SUBSTR(DoR_Tm1_Visit,9),f2).
> > > > EXECUTE .
> > > > 
> > > > Which works, but it is clunky and coming up with an error. Also it is taking half a day to run for all the variables and my huge data set
> > > > 
> > > > 
> > > > Could do with some help on this one!
> > > >  thanks

0
erin
10/19/2016 9:38:00 PM
This is so complex! So this one runs without error too.
But doesn't change the data...which I assume it is meant to do?



On Wednesday, October 19, 2016 at 8:35:54 AM UTC+11, Bruce Weaver wrote:
> On Monday, October 17, 2016 at 10:34:15 PM UTC-4, erin.ps...@gmail.com wr=
ote:
> > Thank you, nice spotting.=20
> >=20
> > The to-be-removed letter string varies:
> >=20
> > DATA LIST FIXED / V1 1-13 (A) V2 14-27 (A) V3 28-41 (A) .=20
> > BEGIN DATA=20
> > Days_OoR1	Study_Grp1	Study1
> > Days_OoR2	Study_Grp2	Study2
> > Days_OoR3	Study_Grp3	Study3
> > Days_OoR4	Study_Grp4	Study4
> > Days_OoR5	Study_Grp5	Study5
> > Days_OoR6	Study_Grp6	Study6
> > Days_OoR7	Study_Grp7	Study7
> > Days_OoR8	Study_Grp8	Study8
> > Days_OoR9	Study_Grp9	Study9
> > Days_OoR10	Study_Grp10	Study10
> > Days_OoR11	Study_Grp11	Study11
> > Days_OoR12	Study_Grp12	Study12
> > Days_OoR13	Study_Grp13	Study13
> > Days_OoR14	Study_Grp14	Study14
> > Days_OoR15	Study_Grp15	Study15
> > Days_OoR16	Study_Grp16	Study16
> > Days_OoR17	Study_Grp17	Study17
> > Days_OoR18	Study_Grp18	Study18
> > Days_OoR19	Study_Grp19	Study19
> > Days_OoR20	Study_Grp20	Study20
> > Days_OoR21	Study_Grp21	Study21
> > Days_OoR22	Study_Grp22	Study22
> > Days_OoR23	Study_Grp23	Study23
> > Days_OoR24	Study_Grp24	Study24
> > Days_OoR25	Study_Grp25	Study25
> > Days_OoR26	Study_Grp26	Study26
> > Days_OoR27	Study_Grp27	Study27
> > Days_OoR28	Study_Grp28	Study28
> > Days_OoR29	Study_Grp29	Study29
> > Days_OoR30	Study_Grp30	Study30
> > END DATA.=20
> >=20
> > DO REPEAT V =3D V1 to V3 /=20
> >  Junk =3D "Days_OoR" "Study_Grp" "Study".=20
> > COMPUTE V =3D REPLACE(V,Junk,"").=20
> > END REPEAT.
> >=20
> > This also runs, but comes back with blanks for all responses for all ca=
ses
> >=20
> > This is not very familiar to me- so it may be my syntax interpretations=
....
>=20
> Given that you have 360 variables, the DO-REPEAT method I suggested earli=
er is not ideal, because it requires you to list all 360 of the to-be-remov=
ed strings.  Here's a more general approach that scales up much more easily=
..  It uses a "trick" shown on Andy Wheeler's website:
>=20
> https://andrewpwheeler.wordpress.com/2013/07/09/quick-spss-tip-cleaning-u=
p-irregular-characters-in-strings/
>=20
> * Read in some sample data.
> DATA LIST FIXED / V1 1-13 (A) V2 14-28 (A) V3 35-42 (A) .
> BEGIN DATA
> Days_OoR1        Study_Grp1        Study1
> Days_OoR2        Study_Grp2        Study2
> Days_OoR3        Study_Grp3        Study3
> Days_OoR4        Study_Grp4        Study4
> Days_OoR5        Study_Grp5        Study5
> Days_OoR6        Study_Grp6        Study6
> Days_OoR7        Study_Grp7        Study7
> Days_OoR8        Study_Grp8        Study8
> Days_OoR9        Study_Grp9        Study9
> Days_OoR10       Study_Grp10       Study10
> Days_OoR11       Study_Grp11       Study11
> Days_OoR12       Study_Grp12       Study12
> Days_OoR13       Study_Grp13       Study13
> Days_OoR14       Study_Grp14       Study14
> Days_OoR15       Study_Grp15       Study15
> Days_OoR16       Study_Grp16       Study16
> Days_OoR17       Study_Grp17       Study17
> Days_OoR18       Study_Grp18       Study18
> Days_OoR19       Study_Grp19       Study19
> Days_OoR20       Study_Grp20       Study20
> Days_OoR21       Study_Grp21       Study21
> Days_OoR22       Study_Grp22       Study22
> Days_OoR23       Study_Grp23       Study23
> Days_OoR24       Study_Grp24       Study24
> Days_OoR25       Study_Grp25       Study25
> Days_OoR26       Study_Grp26       Study26
> Days_OoR27       Study_Grp27       Study27
> Days_OoR28       Study_Grp28       Study28
> Days_OoR29       Study_Grp29       Study29
> Days_OoR30       Study_Grp30       Study30
> END DATA.
>=20
> * String variables used in a vector must all=20
> * be formatted to the same length.
> ALTER TYPE V1 to V3 (A20).
> VECTOR v =3D V1 to V3.
> LOOP # =3D 1 to 3.
> - LOOP #i =3D 1 to 255.
> *  ASCII codes for 0 to 9 are 48 to 57.
> -  IF NOT RANGE(#i,48,57) v(#) =3D REPLACE(v(#),STRING(#i,PIB),"").
> - END LOOP.
> END LOOP.
> ALTER TYPE V1 to V3 (F5.0).
> LIST.
>=20
> * To scale it up for 360 contiguous variables, replace
> * V1 to V3 with V1 to V360, and replace 1 to 3 with 1 to 360.
>=20
> HTH.
0
erin
10/19/2016 9:42:30 PM
On Wednesday, October 19, 2016 at 5:42:32 PM UTC-4, erin.ps...@gmail.com wr=
ote:
> This is so complex! So this one runs without error too.
> But doesn't change the data...which I assume it is meant to do?

Did you run all of the syntax, or only as far as the END LOOP line?  If I s=
top there, and look at the data editor, the original values are still displ=
ayed--but I can also see "Transformations Pending" in the Case Counter area=
 at the bottom of the data editor window.  This means that you need an EXEC=
UTE, or some other command that causes a pass through the data.  It turns o=
ut that the ALTER TYPE command on the next line accomplishes that. =20


Here is my output from the final LIST command:

   V1    V2    V3=20
=20
    1     1     1=20
    2     2     2=20
    3     3     3=20
    4     4     4=20
    5     5     5=20
    6     6     6=20
    7     7     7=20
    8     8     8=20
    9     9     9=20
   10    10    10=20
   11    11    11=20
   12    12    12=20
   13    13    13=20
   14    14    14=20
   15    15    15=20
   16    16    16=20
   17    17    17=20
   18    18    18=20
   19    19    19=20
   20    20    20=20
   21    21    21=20
   22    22    22=20
   23    23    23=20
   24    24    24=20
   25    25    25=20
   26    26    26=20
   27    27    27=20
   28    28    28=20
   29    29    29=20
   30    30    30=20
=20
Number of cases read:  30    Number of cases listed:  30


And here is the complete syntax again, just in case there was some issue wi=
th parts of it being hidden in the Google Groups interface.


* Read in some sample data.
DATA LIST FIXED / V1 1-13 (A) V2 14-28 (A) V3 35-42 (A) .
BEGIN DATA
Days_OoR1        Study_Grp1        Study1
Days_OoR2        Study_Grp2        Study2
Days_OoR3        Study_Grp3        Study3
Days_OoR4        Study_Grp4        Study4
Days_OoR5        Study_Grp5        Study5
Days_OoR6        Study_Grp6        Study6
Days_OoR7        Study_Grp7        Study7
Days_OoR8        Study_Grp8        Study8
Days_OoR9        Study_Grp9        Study9
Days_OoR10       Study_Grp10       Study10
Days_OoR11       Study_Grp11       Study11
Days_OoR12       Study_Grp12       Study12
Days_OoR13       Study_Grp13       Study13
Days_OoR14       Study_Grp14       Study14
Days_OoR15       Study_Grp15       Study15
Days_OoR16       Study_Grp16       Study16
Days_OoR17       Study_Grp17       Study17
Days_OoR18       Study_Grp18       Study18
Days_OoR19       Study_Grp19       Study19
Days_OoR20       Study_Grp20       Study20
Days_OoR21       Study_Grp21       Study21
Days_OoR22       Study_Grp22       Study22
Days_OoR23       Study_Grp23       Study23
Days_OoR24       Study_Grp24       Study24
Days_OoR25       Study_Grp25       Study25
Days_OoR26       Study_Grp26       Study26
Days_OoR27       Study_Grp27       Study27
Days_OoR28       Study_Grp28       Study28
Days_OoR29       Study_Grp29       Study29
Days_OoR30       Study_Grp30       Study30
END DATA.

* String variables used in a vector must all=20
* be formatted to the same length.
ALTER TYPE V1 to V3 (A20).
VECTOR v =3D V1 to V3.
LOOP # =3D 1 to 3.
- LOOP #i =3D 1 to 255.
*  ASCII codes for 0 to 9 are 48 to 57.
-  IF NOT RANGE(#i,48,57) v(#) =3D REPLACE(v(#),STRING(#i,PIB),"").
- END LOOP.
END LOOP.
ALTER TYPE V1 to V3 (F5.0).
LIST.

* To scale it up for 360 contiguous variables, replace
* V1 to V3 with V1 to V360, and replace 1 to 3 with 1 to 360.

HTH.
0
Bruce
10/20/2016 12:37:22 PM
Reply: