expat.library element length


mritter0

Posted on: 2019/7/8 22:32 #1

Just popping in

I am using expat.library to parse my language syntax highlighting files. All is going well, except I get a max of 1024 characters for the elements.

......
<Keywords>
<ReservedWords>
byte int long printf
</ReservedWords>
</Keywords>
.......

For C/C++ the line length is about 1400 characters, but I only get the first 1024.

IExpat->XML_SetCharacterDataHandler(Parser,CharacterElement);

VOID
CharacterElement(void *x, const XML_Char *t, int len)

Am I stuck or is there a way to increase the length? I would think it would be "infinite". Allocate a buffer of len size and copy it from t.

////////////////////////////////////////////

Problem 2 is:

The way I have it above does not work. This does:

<ReservedWords>byte int long printf</ReservedWords>

Why? It doesn't like leading tabs and line feeds.
If the words are not on the same line as <ReservedWords> it will not be recognized.

Edited by mritter0 on 2019/7/8 22:50:54
Edited by mritter0 on 2019/7/9 3:33:51

billyfish

Re: expat.library element length

Posted on: 2019/7/9 2:54 #2

Just popping in

@mritter0

For problem 1, if it's a SAX parser, then you need a buffer of your own to collect the data. You have StartElement() and EndElement() calls that tell you when an element starts and finishes. In between them the data handler (CharacterElement in your case) can get called multiple times, and you have to append the data chunks together yourself. Does that make sense?

For problem 2, I'd need more info but you should have a lifecycle like

StartElement for Keywords
StartElement for ReservedWords
One or more calls to CharacterElement
EndElement for ReservedWords
EndElement for Keywords
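
The lifecycle above can be sketched in C roughly as follows. This is a minimal sketch, not code from expat.library: the TextBuf type and textbuf_reset() are hypothetical names, XML_Char is assumed to be plain char, and the buffer is assumed to reach the handler through the user data pointer (the first argument). It shows the character data handler appending each chunk to a growing buffer instead of treating one call as the whole element.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator, handed to the handlers as user data. */
typedef struct {
    char  *data;   /* NUL-terminated text collected so far */
    size_t len;    /* bytes used, excluding the terminator */
    size_t cap;    /* bytes allocated */
    int    failed; /* set if an allocation failed */
} TextBuf;

/* Call from the start-element handler: forget the previous element's text. */
static void textbuf_reset(TextBuf *b)
{
    b->len = 0;
    if (b->data)
        b->data[0] = '\0';
}

/* Same shape as an expat character data handler:
   (void *userData, const XML_Char *s, int len), with XML_Char assumed to be char.
   The chunk t is NOT NUL-terminated, so exactly len bytes are copied. */
static void CharacterElement(void *x, const char *t, int len)
{
    TextBuf *b = x;
    assert(len >= 0);

    size_t need = b->len + (size_t)len + 1;   /* +1 for the terminator */
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 256;
        while (cap < need)
            cap *= 2;
        char *p = realloc(b->data, cap);
        if (!p) {
            b->failed = 1;                    /* keep what was collected so far */
            return;
        }
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, t, (size_t)len); /* append this chunk */
    b->len += (size_t)len;
    b->data[b->len] = '\0';
}
```

Call textbuf_reset() from the start-element handler and read the buffer in the end-element handler; by then it holds the whole text, however many chunks the parser delivered.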

mritter0

Re: expat.library element length

Posted on: 2019/7/9 3:33 #3

Just popping in

@billyfish

That makes sense, but I only get 1 call to CharacterElement() for the item I am parsing. I can tell because I put some Printf() calls in there and it only shows once.

On the other hand, it doesn't make sense. Just tell me the total length of expat's internal buffer and I will copy it to my buffer of the same length.

//////////////////////////////////////////////////

I wasn't totally clear on problem #2. If the keywords aren't on the same line as <ReservedWords> it will fail to recognize them.

broadblues

Re: expat.library element length

Posted on: 2019/7/9 10:44 #4

Home away from home

@mritter0

You should see the character data handler called more than once, so given the info provided I'm not sure why you are not.

As to the second issue, remember that all character data, including any leading white space, is passed through, so you need to strip any newlines, tabs etc. yourself if present.
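
One way to do that stripping, as a rough sketch: walk the collected text and pick out whitespace-delimited words, so it no longer matters whether the keywords sit on the same line as the opening tag. The helper name next_word() is hypothetical, and the text is assumed to be single-byte characters with a known length.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: find the next whitespace-delimited word in the first
   n bytes of text, starting at offset *pos. On success it returns a pointer
   to the word, stores its length in *wlen and advances *pos past the word.
   It returns NULL when only whitespace (or nothing) is left. */
static const char *next_word(const char *text, size_t n, size_t *pos, size_t *wlen)
{
    size_t i = *pos;

    /* Skip leading tabs, line feeds and spaces. */
    while (i < n && isspace((unsigned char)text[i]))
        i++;
    if (i >= n) {
        *pos = n;
        return NULL;
    }

    size_t start = i;

    /* Consume the word itself. */
    while (i < n && !isspace((unsigned char)text[i]))
        i++;

    *wlen = i - start;
    *pos = i;
    return text + start;
}
```

Called in a loop until it returns NULL, this yields byte, int, long and printf whether or not a tab or line feed comes before the first word.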

For more help you will need to share more detailed code....

Blender For OS4.x : Blues : Walker Broad

broadblues

Re: expat.library element length

Posted on: 2019/7/9 11:26 #5

Home away from home

@mritter0

You could sidestep this issue with a better-designed XML structure, though. Your current idea of a space-separated list inside a single element is not very structured, and structure is what XML is for. You gain no advantage from your XML over plain text, so you might as well have used plain text.

For a more structured XML approach, try something more like...

<keywords>
 <entry class="type">int</entry>
 .
 .
 .
 <entry class="function">printf</entry>
</keywords>

Construct your own scheme, naturally, but use one tag for each keyword and add attributes with class names or similar to associate metadata, such as what colour or style to render with.
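
Reading such an attribute in the start-element handler could look roughly like this. expat passes attributes as a NULL-terminated array of alternating name and value strings; everything else here (find_attr(), classify(), the KeywordClass values) is a hypothetical illustration, and XML_Char is assumed to be plain char.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: atts is a NULL-terminated array of alternating
   name/value strings. Returns the value stored for name, or NULL if the
   attribute is absent. */
static const char *find_attr(const char **atts, const char *name)
{
    if (!atts)
        return NULL;
    for (size_t i = 0; atts[i] != NULL && atts[i + 1] != NULL; i += 2) {
        if (strcmp(atts[i], name) == 0)
            return atts[i + 1];
    }
    return NULL;
}

/* Hypothetical style classes for the highlighter. */
enum KeywordClass { KW_UNKNOWN, KW_TYPE, KW_FUNCTION };

/* Map the class attribute of an <entry> element to a style class. */
static enum KeywordClass classify(const char **atts)
{
    const char *cls = find_attr(atts, "class");
    if (cls == NULL)
        return KW_UNKNOWN;
    if (strcmp(cls, "type") == 0)
        return KW_TYPE;
    if (strcmp(cls, "function") == 0)
        return KW_FUNCTION;
    return KW_UNKNOWN;
}
```

The start-element handler would call classify() on its attribute array and remember the result until the matching end-element call, where the keyword text is complete.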

Blender For OS4.x : Blues : Walker Broad

mritter0

Re: expat.library element length

Posted on: 2019/7/9 18:34 #6

Just popping in

@broadblues

Your second post is what I am trying to get away from. It is fine for most languages, but some have hundreds or even thousands of keywords (PHP, Python, C#....). It's not a deal breaker if I *have* to, just not my first choice.

I don't give up that easy. I will figure something out.

Workbench Explorer - A better way to browse drawers

broadblues

Re: expat.library element length

Posted on: 2019/7/9 19:44 #7

Home away from home

@mritter0

With respect, the more keywords there are, the more advantage there is in dealing with them in a structured way. Your thinking is "flawed" if you think otherwise.

You can add new items very quickly with no danger of corrupting existing ones, for example.

I extended the C/C++ syntax file for richeditor (added newlib functions to the existing standard set). The original is nearly 6000 lines long, admittedly, but I built the extra bits with a trivial bit of Perl, and this was easy because of the simple structured nature of the data.
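
The script mentioned above was Perl and is not shown here; the same idea in C might look like the sketch below, which formats one entry line per keyword. The function name format_entry() and the element and attribute names are hypothetical, and the keyword is assumed to contain no characters that need XML escaping (such as < or &).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical generator: write one line of the form
     <entry class="CLS">WORD</entry>
   into dst, where WORD is the first wlen bytes of word.
   Returns 1 on success, 0 if the line did not fit in cap bytes.
   No XML escaping is done; plain identifier-like keywords are assumed. */
static int format_entry(char *dst, size_t cap, const char *cls,
                        const char *word, size_t wlen)
{
    int n = snprintf(dst, cap, " <entry class=\"%s\">%.*s</entry>\n",
                     cls, (int)wlen, word);
    return n >= 0 && (size_t)n < cap;
}
```

Fed with each word of an old space-separated list, this produces the per-keyword lines without retyping them by hand.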

Don't be lazy; it will save you effort in the long run and make your resource much more robust.

Blender For OS4.x : Blues : Walker Broad

trixie

Re: expat.library element length

Posted on: 2019/7/9 23:06 #8

Amigans Defender

@mritter0

Having read about your trouble with Expat, would it be an option for you to try out something else? There's a lightweight XML parser implemented as a link library for C which I have ported to OS4. It's quite small and fast.

Unlike Expat (which is event-driven and you have to write all handling code yourself), libroxml is a tree-based parser, so it makes it much easier to access the XML data. The API is very well documented.

Let me know if you'd like to give it a try. (I haven't released the port on OS4depot yet because I have only done a few basic tests, but so far it seems to work fine.)

The Rear Window blog

AmigaOne X5000 @ 2GHz / 4GB RAM / Radeon RX 560 / ESI Juli@ / AmigaOS 4.1 Final Edition
SAM440ep-flex @ 667MHz / 1GB RAM / Radeon 9250 / AmigaOS 4.1 Final Edition

mritter0

Re: expat.library element length

Posted on: 2019/7/9 23:34 #9

Just popping in

@trixie

I have re-worked my files to be "the old way" I was doing things, <Keyword>xxx</Keyword>, for each entry. It isn't killing me, but it is more setup work.

ezXML was my second choice over expat. It worked fine with the long lines....at first. It is also a tree-based parser. But it has no error checking for bad elements.

Does libroxml verify the document like expat?

But yes, I will take a look at it. I like that it can write files, too. Send it to my email.

Thanks

Workbench Explorer - A better way to browse drawers
