Tokenizer identifier in Haskell
I'm writing a small program to identify each token of the input as an operator, a parenthesis, or an integer.
However, I ran into an error stating:

    Not in scope: data constructor `Integer'

Here's what I have so far (Data.Char provides isDigit; nothing else is imported):

    import Data.Char (isDigit)

    data Token = TPlus | TTimes | TParenLeft | TParenRight | TNumber Integer | TError
      deriving (Show, Eq)

    tokenize :: String -> [Token]
    tokenize [] = []
    tokenize (c:cs)
      | c == '+'  = TPlus : tokenize cs
      | c == '*'  = TTimes : tokenize cs
      | c == '('  = TParenLeft : tokenize cs
      | c == ')'  = TParenRight : tokenize cs
      | isDigit c = TNumber Integer (read c) : tokenize cs
      | otherwise = TError : tokenize cs

Some examples of the expected output:

    *Main> tokenize "( 1 + 2 )"

should give

    [TParenLeft,TNumber 1,TPlus,TNumber 2,TParenRight]

and

    *Main> tokenize "abc"

should give a single TError, but I'm getting

    [TError,TError,TError]

I'd appreciate it if someone could shed light on these two issues.
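
For the "Not in scope: data constructor `Integer'" part, the problem is the extra Integer in the line

    | isDigit c = TNumber Integer (read c) : tokenize cs

which should be

    | isDigit c = TNumber (read [c]) : tokenize cs

In an expression, a capitalized name is looked up as a data constructor, and Integer is only a type, which is why the compiler reports it as not in scope. The [c] part is needed because read has the type read :: Read a => String -> a, and c is a Char, whereas [c] is a String containing only the character c.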
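
As a small illustration of the point about read, here is a minimal sketch. The helper name digitValue is hypothetical and not part of the original code; it only isolates the Char-to-String wrapping step.

```haskell
-- Hypothetical helper (not in the original post): turn one digit
-- character into an Integer by wrapping it in a one-element String.
digitValue :: Char -> Integer
digitValue c = read [c]   -- read needs a String, so pass [c] rather than c

-- In GHCi, `map digitValue "12"` evaluates to [1,2].
```

Writing read c instead is rejected by the type checker, because c is a Char and read expects a String.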

tokenize "abc" returns [TError, TError, TError] because of the error-handling policy:

    | otherwise = TError : tokenize cs

This leads to:

    tokenize "abc"
    -- c = 'a', cs = "bc"
    TError : tokenize "bc"
    TError : (TError : tokenize "c")
    TError : TError : TError : []
    [TError, TError, TError]

If you want to group a run of errors into a single TError, you should drop the TError tokens produced by the incorrect input that follows:

    | otherwise = TError : (dropWhile (\o -> o == TError) (tokenize cs))
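
Putting both fixes together, the tokenizer might look like the sketch below. One thing here is an assumption and not part of the answer above: the isSpace guard that skips whitespace. It is added only because the expected output for "( 1 + 2 )" contains no tokens for the spaces; without it each space would become a TError.

```haskell
import Data.Char (isDigit, isSpace)

data Token = TPlus | TTimes | TParenLeft | TParenRight | TNumber Integer | TError
  deriving (Show, Eq)

tokenize :: String -> [Token]
tokenize [] = []
tokenize (c:cs)
  | isSpace c = tokenize cs                       -- assumption: whitespace is skipped
  | c == '+'  = TPlus : tokenize cs
  | c == '*'  = TTimes : tokenize cs
  | c == '('  = TParenLeft : tokenize cs
  | c == ')'  = TParenRight : tokenize cs
  | isDigit c = TNumber (read [c]) : tokenize cs  -- one digit per token, as in the original
  | otherwise = TError : dropWhile (== TError) (tokenize cs)  -- collapse a run of errors
```

With this version, tokenize "( 1 + 2 )" gives [TParenLeft,TNumber 1,TPlus,TNumber 2,TParenRight] and tokenize "abc" gives [TError]. Note that each digit still becomes its own TNumber, so "12" yields two tokens, and invalid characters separated only by whitespace are merged into one TError.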