replacing "whole words" in a given string

Hello Everyone,

I am working with a cell array of strings, where each cell represents an equation that I need as output for another project outside of MATLAB. The strings were created by building the equations with symbolic variables and converting them to strings. Now I need to change the variable names so that they are appropriate for the output. My problem is that replacing strings in MATLAB (so far I have tried strrep and regexprep) looks for an exact substring match, and I can't find a way to make it look for a "whole word" match, as you can in Microsoft Word, for example. I also can't replace the symbolic variables before converting to strings, because the variable names I need to change to are invalid in MATLAB. Below is an example of what I mean.

Here are two of our variables, and the expressions they need to be converted to:

'KHSTO' ------> 'K_{HSTO}^{A}'
'T' ------> 'T^{A}'

So, the original solution was to convert to strings and then replace, but 'T' occurs in both of the above variables, so replacing 'T' also changes 'KHSTO'. And we cannot convert to the desired variable names with symbolics because they are invalid. Any ideas for a solution?
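
To make the collision concrete, here is a minimal sketch of what goes wrong. The equation string 'KHSTO*T' is a made-up example, not one of the real equations:

```matlab
% Hypothetical equation string containing both variable names from above.
expr = 'KHSTO*T';

% strrep replaces every occurrence of the substring 'T',
% including the one inside 'KHSTO'.
bad = strrep(expr, 'T', 'T^{A}');

disp(bad)   % KHST^{A}O*T^{A}
```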

Thanks for any help.
Reply kavon 1/12/2011 10:18:04 AM

Try John D'Errico's allwords:

http://www.mathworks.com/matlabcentral/fileexchange/27184-allwords

Description

Sentence parsing can be done one word at a time using strtok. However,
sometimes it is useful to (efficiently) extract all words into a cell
array in one function call. The function allwords.m does exactly this.

Spaces, white space (tabs), carriage returns, and punctuation
characters are all valid separator characters by default. In this
example, I had a period at the end, as well as multiple spaces between
some words.

str = 'The quick brown fox jumped over the lazy dog.';
words = allwords(str)
words =
  'The' 'quick' 'brown' 'fox' 'jumped' 'over' 'the' 'lazy' 'dog'

This utility can also work on any integer vector. The default
separators for numeric vectors are [-inf inf NaN], but you can assign
any separators you desire. Here, parse a string of integers, with only
NaN elements as the separator.

str = [1 2 4 2 inf 3 3 5 nan 4 6 5];
words = allwords(str,nan);
words{1}
ans =
     1 2 4 2 Inf 3 3 5

words{2}
ans =
     4 6 5

Finally, allwords is efficient. For example, on a random numeric
string of length 1e6, allwords parses it into over 90000 distinct
"words" in less than 0.5 seconds.

str = round(rand(1,1000000)*10);
tic
words = allwords(str,[0 10]);
toc
Elapsed time is 0.455194 seconds.

There were over 90000 different words that were extracted

numel(words)
ans =
     90310

The longest word had length 104.

max(cellfun(@numel,words))
ans =
     104
Reply ImageAnalyst 1/12/2011 11:11:50 AM
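
As an alternative to tokenizing with allwords, the whole-word match asked for in the question can probably be done with regexprep alone, since MATLAB regular expressions support the word-boundary anchors \< (start of word) and \> (end of word). The following is only a sketch: the equation strings in eqs are made up, and it assumes the old variable names contain only letters, digits, and underscores, so they need no escaping in the pattern.

```matlab
% Hypothetical equations; the real ones would come from the symbolic toolbox.
eqs = {'KHSTO*T + 2*T', 'T/KHSTO'};

% Old names and their replacements, taken from the question.
oldNames = {'KHSTO', 'T'};
newNames = {'K_{HSTO}^{A}', 'T^{A}'};

% Replace each name only where it appears as a whole word.
% \< and \> anchor the match to the start and end of a word, so the
% 'T' inside 'KHSTO' (or inside 'K_{HSTO}^{A}') is left alone.
for k = 1:numel(oldNames)
    pattern = ['\<' oldNames{k} '\>'];
    eqs = regexprep(eqs, pattern, newNames{k});
end

disp(eqs{1})   % K_{HSTO}^{A}*T^{A} + 2*T^{A}
disp(eqs{2})   % T^{A}/K_{HSTO}^{A}
```

Note that the replacements are applied one after another, so a later pattern could in principle match text inserted by an earlier replacement. That does not happen with these two names, but it is worth checking for the full variable list.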

