Tokenizer identifier in Haskell
I'm writing a small program to identify each token of the input as an operator, a parenthesis, or an int.
However, I encountered a problem stating

    Not in scope: data constructor `Integer'

Here's what I have so far (Data.Char defines isDigit, nothing else):

    import Data.Char (isDigit)

    data Token = TPlus | TTimes | TParenLeft | TParenRight
               | TNumber Integer | TError
               deriving (Show, Eq)

    tokenize :: String -> [Token]
    tokenize [] = []
    tokenize (c:cs)
      | c == '+'  = TPlus : tokenize cs
      | c == '*'  = TTimes : tokenize cs
      | c == '('  = TParenLeft : tokenize cs
      | c == ')'  = TParenRight : tokenize cs
      | isDigit c = TNumber Integer (read c) : tokenize cs
      | otherwise = TError : tokenize cs

Some examples of the expected output:

    *Main> tokenize "( 1 + 2 )"

should give

    [TParenLeft,TNumber 1,TPlus,TNumber 2,TParenRight]

and

    *Main> tokenize "abc"

should give a single TError, but I'm getting

    [TError,TError,TError]

I'd appreciate it if anyone could shed light on these two issues.
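For the "Not in scope: data constructor `Integer'" part, the problem is that you have an extra Integer in the line

    | isDigit c = TNumber Integer (read c) : tokenize cs

which should be

    | isDigit c = TNumber (read [c]) : tokenize cs

In an expression, a capitalized name such as Integer is looked up as a data constructor, and no constructor with that name exists; the type belongs only in the data declaration. The [c] part is needed because read has type read :: Read a => String -> a, and c is a Char, whereas [c] is a String containing only the char c.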
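Putting that fix into the program from the question gives the sketch below. It only applies the one-line change described above; everything else is the asker's code with its layout restored.

```haskell
import Data.Char (isDigit)

data Token
  = TPlus
  | TTimes
  | TParenLeft
  | TParenRight
  | TNumber Integer
  | TError
  deriving (Show, Eq)

tokenize :: String -> [Token]
tokenize [] = []
tokenize (c:cs)
  | c == '+'  = TPlus       : tokenize cs
  | c == '*'  = TTimes      : tokenize cs
  | c == '('  = TParenLeft  : tokenize cs
  | c == ')'  = TParenRight : tokenize cs
  -- [c] wraps the Char in a one-element String, which is what read expects;
  -- the result type Integer is inferred from the TNumber constructor.
  | isDigit c = TNumber (read [c]) : tokenize cs
  | otherwise = TError      : tokenize cs
```

With this version, tokenize "(1+2)" evaluates to [TParenLeft,TNumber 1,TPlus,TNumber 2,TParenRight]. Note that it still works one character at a time, so a space produces a TError and "12" produces two TNumber tokens.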
tokenize "abc" is returning [TError, TError, TError] because of your error treatment policy:

    | otherwise = TError : tokenize cs

This emits one TError per unrecognized character, so the evaluation goes:

    tokenize "abc"
    -- c = 'a', cs = "bc"
    TError : tokenize "bc"
    TError : (TError : tokenize "c")
    TError : TError : TError : []
    [TError, TError, TError]

If you want to group a run of errors into a single TError, you should drop the TError tokens that immediately follow:

    | otherwise = TError : (dropWhile (\o -> o == TError) (tokenize cs))
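A minimal sketch of the full tokenizer with both fixes applied might look like the following. The isSpace clause is an assumption added here and is not part of the answer above; it is included only so that the question's first example, which contains spaces, yields the expected tokens.

```haskell
import Data.Char (isDigit, isSpace)

data Token
  = TPlus
  | TTimes
  | TParenLeft
  | TParenRight
  | TNumber Integer
  | TError
  deriving (Show, Eq)

tokenize :: String -> [Token]
tokenize [] = []
tokenize (c:cs)
  | c == '+'  = TPlus       : tokenize cs
  | c == '*'  = TTimes      : tokenize cs
  | c == '('  = TParenLeft  : tokenize cs
  | c == ')'  = TParenRight : tokenize cs
  | isDigit c = TNumber (read [c]) : tokenize cs
  -- Assumption (not in the original answer): whitespace is skipped.
  | isSpace c = tokenize cs
  -- Emit one TError, then discard any TError tokens that directly follow it.
  -- The (== TError) comparison relies on the derived Eq instance.
  | otherwise = TError : dropWhile (== TError) (tokenize cs)
```

Here tokenize "abc" evaluates to [TError] and tokenize "( 1 + 2 )" to [TParenLeft,TNumber 1,TPlus,TNumber 2,TParenRight]. Because the grouping works on tokens rather than on characters, bad characters separated only by whitespace (as in "a b") also collapse into one TError.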