How to parse text

How do I parse text? For example, I read all the text from the screen with ReadText.
Then I would like to split that text into lines at each "\n".
I tried the following two approaches, but neither works:
1) put ad split by "\n" into numbers_list
2) set the itemDelimiter to "\n"


Log ReadText((0,0,800,800))
set ad to ReadText((0,0,800,800))

put ad split by "\n" into numbers_list
log numbers_list

set the itemDelimiter to "\n"
put ad into numbers_list2
log numbers_list2

Comments

  • The OCR engine uses Unicode line endings, so the correct way to split is:
    put myvar split by numToChar(8232) into myList
    

    Hope this helps.
  • SenseTalkDoug ForumAdmin admin
    A couple of other thoughts:

    First, be aware that SenseTalk treats quoted text literally, so "\n" in SenseTalk is two characters: the backslash "\" followed by the letter "n". The only exception to this in Eggplant is the TypeText command, which translates some sequences like "\n" into special keystrokes on the SUT like Return. So your example code is actually looking for occurrences of backslash-n in the text, not for returns.

    Next, Allen is correct that text returned by the ReadText function will use the Unicode lineSeparator character between lines of text. To make this easy to work with, beginning with version 11 of Eggplant the defaultLineDelimiter is set by default to be the set of all common line endings: (CRLF, LF, CR, LineSeparator, ParagraphSeparator). So you could get what you want by using the "each line of" expression:
    put each line of ad into numbers_list
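
    As a hedged sketch, the two attempts from the original question could be rewritten with the predefined LineSeparator variable in place of "\n". The names ad, numbers_list and numbers_list2 and the screen rectangle are taken from the question, and the sketch assumes ReadText separates lines only with LineSeparator:

    ```sensetalk
    set ad to ReadText((0,0,800,800))

    -- Attempt 1: split on the Unicode line separator, not the literal text "\n"
    put ad split by LineSeparator into numbers_list
    log numbers_list

    -- Attempt 2: make the line separator the item delimiter, then collect
    -- the items (a plain "put ad into ..." would only copy the text)
    set the itemDelimiter to LineSeparator
    put each item of ad into numbers_list2
    log numbers_list2
    ```

    On version 11 or later, "each line of" is likely the more forgiving choice, since the default line delimiter described above accepts any of the common line endings.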
    

    Finally, Allen loses one SenseTalk style point for using numToChar(8232) instead of the predefined variable LineSeparator which has the same value but is more readable (Allen, what were you thinking?!) ;)
  • Thanks. That works.

    How do I read the next item in the list?
    For example, once I have found "Min", I want to get the value of the item that follows it, because the values are split by "\n": "Min:\n100\n".

    repeat with each line in myList
    if the first word of it is "Min"
    then put next item into min
    end repeat
  • SenseTalkDoug ForumAdmin admin
    This is a case where using an iterator would come in handy, as iterators give you much more control over your progress through their values. Iterators include lists, ranges, and custom iterator objects. You can test whether a value gives you the control of an iterator by using the "is an iterator" operator:
    put (1,2,3) is an iterator -- true
    put 10..15 is an iterator -- true
    put "abc" is an iterator -- false
    
    As you can see, a text string is not an iterator so you'll first have to convert the text into a list in order to use this technique.
    put each line of sourceString into myList
    
    Once you have a source value that can act as an iterator, you can iterate over its values using "repeat with each" but you now have the added control of being able to move the iteration pointer forward or backward (or skip it directly to any point in the list) during the repeat. Complete control is given by setting the currentIndex property of the list. But in your case, all you need to do is access the nextValue property to get the next value and advance the iterator. Here's what it might look like:
    set source to {{
    Min:
    100
    Max:
    300
    }}
    put each line of source into myList
    
    repeat with each item of myList
    	put myList.currentIndex && it
    	if it begins with "Min" then put myList.nextValue into min
    end repeat
    
    put "Min is " & min
    
    The example above gives this output:

    1 Min:
    3 Max:
    4 300
    Min is 100

    So you can see that accessing the nextValue property advanced the currentIndex of the list, causing the repeat loop to proceed with item 3 on the next iteration.
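
    If the iterator behavior is not wanted, a plain index-based loop is one possible alternative. This is only a sketch using the same sample text; minValue is a name chosen here for illustration, and the loop stops one item early so that item (n + 1) always exists:

    ```sensetalk
    set source to {{
    Min:
    100
    Max:
    300
    }}
    put each line of source into myList

    repeat with n = 1 to (the number of items in myList) - 1
    	-- when a label is found, take the item that follows it
    	if item n of myList begins with "Min" then put item (n + 1) of myList into minValue
    end repeat

    put "Min is " & minValue
    ```

    Unlike nextValue, this form does not skip the value line, so the loop still visits every item.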
  • Where can I find Unicode information?
    I tried to split a string by the space character.

    http://www.ssec.wisc.edu/~tomw/java/unicode.html
    According to that page, the Unicode value for space seems to be 9248.

    But the following statements don't work.
    For example, I have the string "good |" and I want to get the value "good".
    put string1 split by numToChar(9248) into list1
    put item 1 of list1
  • EggplantMatt ForumAdmin admin
    In SenseTalk, the space character is simply "space". So the following should work:
    put string1 split by space into list1
    put item 1 of list1
    
    The line ending is really the only thing being returned by an eggPlant function where the Unicode value is a consideration.
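
    For reference, a small sketch of why numToChar(9248) does not match: decimal 9248 is U+2420, the visible "symbol for space" glyph, while the ordinary space character is code point 32. string1 is assumed to hold "good |" as in the question:

    ```sensetalk
    set string1 to "good |"

    put charToNum(space) -- code point of an ordinary space: 32

    -- splitting on code point 32 is the same as splitting on space
    put string1 split by numToChar(32) into list1
    put item 1 of list1

    -- words are delimited by whitespace, so this is another way to get the first token
    put the first word of string1
    ```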
  • EggplantMatt ForumAdmin admin
    @Doug: Don't blame Allen; I gave him the code. I didn't realize that LineSeparator included that value.