Commit graph

50 commits

Author SHA1 Message Date
Paul Beckingham
0c7e731b0d Lexer: Integrated ::commonLength
- Uses std::string::size_type for all string lengths, offsets.
- Rewrote ::isLiteral to be simpler.
- Added support for abbreviated DOM refs.
- Obeys rc.abbreviation.minimum, indirectly.
- Added tests.
2015-07-27 00:31:15 -04:00
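The ::commonLength commits in this group (word root comparison, then offsets for embedded parsing) can be illustrated with a minimal sketch. This is not the code from the commit: the signatures are assumptions, the comparison is byte-wise rather than UTF-8 aware, and `isAbbreviation` is a hypothetical helper showing how a minimum length such as rc.abbreviation.minimum might be applied.

```cpp
#include <cassert>
#include <string>

// Assumed signature: number of leading bytes that 'a' and 'b' share.
std::string::size_type commonLength(const std::string& a, const std::string& b)
{
  std::string::size_type n = 0;
  while (n < a.size() && n < b.size() && a[n] == b[n])
    ++n;
  return n;
}

// Assumed offset variant: compare starting at posA in 'a' and posB in 'b',
// so a word embedded in a longer string can be checked without copying it.
std::string::size_type commonLength(const std::string& a, std::string::size_type posA,
                                    const std::string& b, std::string::size_type posB)
{
  std::string::size_type n = 0;
  while (posA + n < a.size() && posB + n < b.size() && a[posA + n] == b[posB + n])
    ++n;
  return n;
}

// Hypothetical helper: 'input' abbreviates 'full' when it is at least
// 'minimum' bytes long and every byte of it matches the start of 'full'.
bool isAbbreviation(const std::string& input, const std::string& full,
                    std::string::size_type minimum)
{
  return input.size() >= minimum &&
         input.size() <= full.size() &&
         commonLength(input, full) == input.size();
}
```

With a minimum of 2, "pro" would be accepted as an abbreviation of "project" under this sketch, while "p" and "prx" would not.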
Paul Beckingham
a9b701ae6d Lexer:: Implemented ::commonLength with offsets, for embedded parsing 2015-07-27 00:04:00 -04:00
Paul Beckingham
244c81a647 Lexer: Implemented ::commonLength for word root comparison 2015-07-26 23:52:17 -04:00
Paul Beckingham
f5792a03fb Lexer: Captures minimumMatchLength for abbreviated attribute matching 2015-07-26 22:58:02 -04:00
Paul Beckingham
d0c4326af3 Lexer: Upgraded attributes vector to a map of name to type 2015-07-26 12:22:02 -04:00
Paul Beckingham
52d2bbd11a Lexer: Implemented ::isOneOf using a std::map as input 2015-07-26 12:09:40 -04:00
Paul Beckingham
391d527328 Lexer: Added end-boundary sensitivity to ::isLiteral and ::isOneOf 2015-07-26 10:48:26 -04:00
Paul Beckingham
3e74aa51e2 Lexer: Added 'endBoundary' requirement to ::isUUID 2015-07-25 17:59:40 -04:00
Paul Beckingham
c769891b76 Lexer: Implemented ::isInteger to help parsing. 2015-07-25 17:54:55 -04:00
Paul Beckingham
37e31e8e0b Lexer: Implemented ::isOneOf, to help with parsing 2015-07-25 17:34:51 -04:00
Paul Beckingham
9394b96202 Lexer: Implemented ::isLiteral, to help with parsing 2015-07-25 17:34:01 -04:00
Paul Beckingham
fed3b815a0 Lexer: Dead code removal 2015-07-17 14:59:42 -04:00
Paul Beckingham
d0e4f4ca10 Lexer: Implemented ::decomposePattern 2015-07-11 17:09:29 -04:00
Paul Beckingham
1bef45ff47 Lexer: Added ::decomposeSubstitution and more flexible ::dequote
- ::dequote can now be given a string of valid quote characters, which defaults
  to '".
- ::decomposeSubstitution properly parses the /from/to/g construct allowing for
  escaped characters (\/).
- The 'g' at the end of a substitution is now considered to be a string of flag
  characters, which may contain 'g'. No other flag values are currently
  supported.
2015-07-11 16:40:52 -04:00
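The behavior described in this commit can be sketched as follows. The signatures are assumptions, not the ones in the commit, and `readField` is a hypothetical helper. The sketch returns the flag characters unvalidated and leaves their interpretation to the caller.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads up to the next unescaped '/', starting at 'pos'.
// An escaped delimiter (\/) is stored as a plain '/'. On success, 'pos' is
// left just past the delimiter.
static bool readField(const std::string& text, std::string::size_type& pos,
                      std::string& field)
{
  field.clear();
  while (pos < text.size())
  {
    if (text[pos] == '\\' && pos + 1 < text.size() && text[pos + 1] == '/')
    {
      field += '/';
      pos += 2;
    }
    else if (text[pos] == '/')
    {
      ++pos;
      return true;
    }
    else
      field += text[pos++];
  }
  return false;  // ran out of input before finding a delimiter
}

// Assumed signature: splits "/from/to/flags" into its three parts.
bool decomposeSubstitution(const std::string& text, std::string& from,
                           std::string& to, std::string& flags)
{
  if (text.empty() || text[0] != '/')
    return false;

  std::string::size_type pos = 1;
  if (!readField(text, pos, from) || from.empty())
    return false;
  if (!readField(text, pos, to))
    return false;

  flags = text.substr(pos);  // everything after the final '/', e.g. "g"
  return true;
}

// Assumed signature: strips one matching pair of surrounding quotes, where
// the accepted quote characters default to ' and ".
std::string dequote(const std::string& input, const std::string& quotes = "'\"")
{
  if (input.size() >= 2 &&
      input.front() == input.back() &&
      quotes.find(input.front()) != std::string::npos)
    return input.substr(1, input.size() - 2);
  return input;
}
```

A caller would then test for global replacement with something like `flags.find('g') != std::string::npos`.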
Paul Beckingham
642f378462 Lexer:: Implemented ::isHardBoundary to detect filter tokens 2015-07-11 13:12:09 -04:00
Paul Beckingham
1fed8c55f1 Lexer: Collapsed two ::isString calls into one 2015-07-06 16:40:18 -04:00
Paul Beckingham
7a6d546a0d Lexer:: Added polymorphic ::readWord for quotes and unquoted strings 2015-07-06 16:37:46 -04:00
Paul Beckingham
d82da280cb Lexer: Implemented ::readWord
- Lexer::readWord is a general-purpose text parser, for finding plain words and
  quoted strings. It supports \uNNNN and U+NNNN unicode sequences, and general
  escapes, \t, \', \" etc.
2015-07-06 15:32:12 -04:00
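A rough sketch of a reader in the spirit of this description is shown below. It is an illustration under stated assumptions, not the committed code: the signature, the helper names `appendUtf8` and `readHex4`, the use of space and tab as word terminators, and the restriction to four-hex-digit code points are all assumptions.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: appends a code point (up to U+FFFF) as UTF-8.
static void appendUtf8(std::string& out, unsigned int cp)
{
  if (cp < 0x80)
    out += static_cast<char>(cp);
  else if (cp < 0x800)
  {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
  else
  {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
}

// Hypothetical helper: parses exactly four hex digits starting at 'pos'.
static bool readHex4(const std::string& s, std::string::size_type pos,
                     unsigned int& value)
{
  if (pos + 4 > s.size())
    return false;
  value = 0;
  for (int i = 0; i < 4; ++i)
  {
    unsigned char c = static_cast<unsigned char>(s[pos + i]);
    if (!std::isxdigit(c))
      return false;
    value = value * 16 + (std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10);
  }
  return true;
}

// Assumed signature: reads a plain word or a quoted string at 'cursor'.
// On success, stores the decoded text in 'word' and advances 'cursor'.
bool readWord(const std::string& text, std::string::size_type& cursor,
              std::string& word)
{
  word.clear();
  std::string::size_type pos = cursor;
  char quote = 0;
  if (pos < text.size() && (text[pos] == '\'' || text[pos] == '"'))
    quote = text[pos++];

  while (pos < text.size())
  {
    char c = text[pos];
    unsigned int cp = 0;
    if (quote && c == quote)               // closing quote ends the string
    {
      cursor = pos + 1;
      return true;
    }
    if (!quote && (c == ' ' || c == '\t')) // white space ends a plain word
      break;

    if (c == '\\' && pos + 1 < text.size() && text[pos + 1] == 'u' &&
        readHex4(text, pos + 2, cp))       // \uNNNN
    {
      appendUtf8(word, cp);
      pos += 6;
    }
    else if (c == 'U' && pos + 1 < text.size() && text[pos + 1] == '+' &&
             readHex4(text, pos + 2, cp))  // U+NNNN
    {
      appendUtf8(word, cp);
      pos += 6;
    }
    else if (c == '\\' && pos + 1 < text.size())
    {
      switch (text[pos + 1])               // general escapes
      {
      case 't': word += '\t'; break;
      case 'n': word += '\n'; break;
      default:  word += text[pos + 1]; break;  // \' \" \\ and so on
      }
      pos += 2;
    }
    else
    {
      word += c;
      ++pos;
    }
  }

  if (quote || word.empty())               // unterminated quote, or no word
    return false;
  cursor = pos;
  return true;
}
```

On failure the sketch leaves `cursor` untouched, so a caller can try another token type at the same position.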
Paul Beckingham
81599071e7 Lexer: Implemented ::decomposePair 2015-07-06 11:28:39 -04:00
Paul Beckingham
1836ac29e2 Lexer: Removed experimental code, didn't help 2015-07-04 15:03:28 -04:00
Paul Beckingham
3b99559216 Lexer: Added standalone token support
- Added default ctor.
- Added ::token method for classifying whole tokens.
- Stubbed token classifier methods.
2015-07-04 11:38:09 -04:00
Paul Beckingham
ad17ad82dd Lexer: Removed obsolete method def 2015-07-04 10:34:16 -04:00
Paul Beckingham
f33da18789 Lexer: Removed ::isList and Lexer::Type::list - not needed 2015-07-01 18:04:21 -04:00
Paul Beckingham
b090c6bccf Lexer: Removed unnecessary ::ambiguity method 2015-07-01 16:18:28 -04:00
Paul Beckingham
86ed232348 Lexer: Added ::wasQuoted to determine original quote state 2015-06-28 12:35:06 -04:00
Paul Beckingham
d9bcbdee0a Lexer: Added ::isContiguous for word-like matching 2015-06-22 21:34:57 -04:00
Paul Beckingham
f4a7c50f1a Lexer: Added ::isSet to recognize numerical sets
- A numerical set is a list of numbers: 1,2,3
  Or a range of numbers:                5-10
  Or a combination of both:             1,2,3,5-10
2015-06-19 18:28:58 -07:00
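The set syntax described in this commit can be recognized with a small sketch like the one below. The signature and the `readDigits` helper are assumptions. The sketch also assumes that a lone number such as 7 is an ordinary integer rather than a set, and that the whole string must be consumed.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: consumes one or more decimal digits at 'pos'.
static bool readDigits(const std::string& s, std::string::size_type& pos)
{
  std::string::size_type start = pos;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
    ++pos;
  return pos > start;
}

// Assumed signature: true for lists (1,2,3), ranges (5-10) and
// combinations of both (1,2,3,5-10).
bool isSet(const std::string& text)
{
  std::string::size_type pos = 0;
  int count = 0;  // how many numbers were seen

  for (;;)
  {
    if (!readDigits(text, pos))
      return false;
    ++count;

    // Optional range: number '-' number.
    if (pos < text.size() && text[pos] == '-')
    {
      ++pos;
      if (!readDigits(text, pos))
        return false;
      ++count;
    }

    // A comma means another element follows; anything else ends the set.
    if (pos < text.size() && text[pos] == ',')
      ++pos;
    else
      break;
  }

  // Assumption: at least two numbers, and nothing left over.
  return count > 1 && pos == text.size();
}
```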
Paul Beckingham
c6dbdf87a4 Lexer: Migrated isalpha to Lexer::isAlpha 2015-04-16 23:24:17 -04:00
Paul Beckingham
237d932ff9 Lexer
- Improved ::isIdentifier, ::isUUID and ::isDOM.
2015-03-01 23:54:45 -05:00
Paul Beckingham
8791c0a921 Lexer
- Migrated old noSpaces() function into Lexer::isOneWord.
2015-02-22 18:23:26 -05:00
Paul Beckingham
745aad0d27 Lexer
- Renamed Lexer2 to Lexer, it looks good enough to assume control.
2015-02-22 18:23:03 -05:00
Paul Beckingham
0cf18f3b16 Lexer2
- Integrated Lexer2 in place of Lexer. Tests fail.
2015-02-22 13:52:14 -05:00
Paul Beckingham
6626207ad1 TW-1522
- TW-1522 Date format doesn't like hyphens (thanks to Scott Carter).
2015-01-25 14:49:02 -05:00
Paul Beckingham
b7ad091d00 Updated copyright to 2015 2015-01-01 00:00:41 -05:00
Paul Beckingham
06319711f1 Quoting
- Removed automatic dequoting by the Lexer.
- Implemented Lexer::dequote for manual control.
- Variant dequotes string values when appropriate.
- Fixed some unit tests that became wrong.
2014-11-18 00:59:52 -05:00
Paul Beckingham
38359b779a Lexer
- Exposed more primitives as static methods.
2014-11-04 22:51:25 -05:00
Paul Beckingham
68fb1136cc Lexer
- Added notes about additional lexeme types that are needed, long term.
2014-09-07 13:37:46 -04:00
Paul Beckingham
aab23692f1 Lexer
- Added a new type Lexer::typeTag.
2014-09-07 01:17:48 -04:00
Paul Beckingham
9778100d29 Lexer
- When parsing two-character operators ('or') from a string ('ordinary'), the
  lack of boundary between the 'r' and the 'd' now prevents the operator 'or'
  from being recognized.
2014-07-03 16:26:17 -04:00
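The boundary rule in this commit can be sketched as a check on the character that follows a candidate operator. The function name and signature are hypothetical, and treating any alphanumeric character as "no boundary" is an assumption. Only the end boundary is checked here.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: true when 'op' appears at 'pos' in 'text' and is
// followed by a boundary (end of input or a non-alphanumeric character).
bool matchesWordOperator(const std::string& text, std::string::size_type pos,
                         const std::string& op)
{
  if (pos > text.size() || text.compare(pos, op.size(), op) != 0)
    return false;

  std::string::size_type end = pos + op.size();
  return end == text.size() ||
         !std::isalnum(static_cast<unsigned char>(text[end]));
}
```

Under this sketch, 'or' is not recognized at the start of 'ordinary' because 'd' follows the 'r' directly, while 'or' in 'a or b' is recognized.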
Paul Beckingham
65f979cb4f Lexer
- Refactored (step 1) the ISO and Legacy date/duration parsing for lexer state
  machine breakout.
2014-06-29 09:36:27 -04:00
Paul Beckingham
008ba6ecab Lexer
- Implemented boundary detection hints.
2014-06-18 17:45:25 -04:00
Paul Beckingham
7d4e166277 Lexer
- Implemented an overload of ::token_split that preserves types.
2014-06-14 13:46:10 -04:00
Paul Beckingham
2554b29041 Lexer
- Needed a shift counter, rather than a read counter, as ::token was
  lexing '-10d' into '-' and '-10d', which when evaluated is '--10d',
  which yields 10d.
2014-06-10 15:42:21 -04:00
Paul Beckingham
7598997e70 Lexer
- Implemented ::token_split, which performs a full lex, and doesn't require
  white space like ::word_split does.
- Added unit tests.
2014-05-31 13:51:10 -04:00
Paul Beckingham
0af9bbdc03 Lexer
- Renamed ::split to ::word_split, for clarity and because of the need for a
  full token split, coming next.
2014-05-31 13:48:52 -04:00
Paul Beckingham
592a3bb60f Lexer
- Lexer now makes a speculative legacy dateformat parse whenever it encounters
  a decimal digit.  This assumes that rc.dateformat begins with a numeric date
  element, which is a restriction, but not a big one.
2014-05-29 18:09:11 -04:00
Paul Beckingham
611812007a Lexer
- Implemented Lexer::word, which is just like ::token, but does not
  understand dates, durations or operators.
- Implemented Lexer::split, which uses Lexer::word.
- Added unit tests.
2014-04-23 23:19:41 -04:00
Paul Beckingham
9bfe40fac7 Lexer, Duration
- Merged libexpr code.
2014-01-02 00:55:53 -05:00
Paul Beckingham
9bf1ec2f7c Code Cleanup
- Eliminated Lexer.
2011-07-26 00:37:49 -04:00
Paul Beckingham
ed8454c202 Expressions
- Implemented sequence --> infix converter.
- Added new Lexer code.
- Added Lexer unit tests.
2011-06-06 01:46:11 -04:00