Lazy updating for ascertaining block numbers in a document
From qtnode
Note: this probably isn't the most clear explanation. I'll rewrite it in a day or so after a fresh look at it.
At the time of writing, there is no API function that maps a cursor or text block to a line number. In fact, a line number isn't very well-defined anyway, given the nature of QTextDocument. Therefore the following presents one interpretation of line numbers, presuming that only QTextBlocks are inserted into the document. At the very least, it presents one method of enumerating QTextBlocks in order. This might be especially useful if you want line numbers running down the side in a margin (in addition to a row/column indicator in the status bar, which by itself would have a much simpler update strategy). You'll have to decide for your application how you want it to work with things like frames and tables.
If all the fonts are the same height and you have no line wrapping, you can probably just calculate the line numbers from the amount scrolled. However, the following solution is independent of point size, line wrapping, and such.
Right now just explanations and references to code are presented. An implementation is mostly complete, and should be posted at a later time. It's a little difficult explaining without having the code available, but I'm just currently making it extra shiny :-)
You could just count from the start of the document every time the cursor moves or the user scrolls the view. This brute-force approach on a document tens of thousands of lines long is pretty zippy on an Athlon 64 X2 4200+ with 1 gig of RAM and a mostly idle system. However, we don't have to throw away our PIIs just yet ;-)
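For comparison, here is a minimal sketch of the brute-force count. It is a Qt-free model, not code from the implementation mentioned above: Document, BlockIt and bruteForceBlockNumber are hypothetical names, and a std::list stands in for the chain of QTextBlocks that you would walk with QTextDocument::begin() and QTextBlock::next().

```cpp
#include <iterator>
#include <list>
#include <string>

// Qt-free stand-in for a document: an ordered list of text blocks.
// (Assumption: with Qt you would walk QTextDocument::begin() / QTextBlock::next().)
using Document = std::list<std::string>;
using BlockIt = Document::const_iterator;

// Count blocks from the top of the document down to `target` (1-based).
// The cost is linear in the distance from the top and is paid on every call.
int bruteForceBlockNumber(const Document &doc, BlockIt target)
{
    int number = 1;
    for (BlockIt it = doc.begin(); it != target; ++it)
        ++number;
    return number;
}
```

Every query pays for the whole walk again, which is the cost the rest of this page tries to avoid.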
The invariant
The basic invariant is this: from the top of the document and down to a certain block, the block numbers are correct. Let's call the position of this last correct block lastCorrectBlockPosition. Beyond that the block numbers may or may not be up-to-date -- we don't care yet. We'll update those blocks if their block numbers are needed, such as if the user scrolls them into view.
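Stated as a predicate, the invariant might look like the following sketch. It uses a Qt-free model: Block, Document and invariantHolds are hypothetical names, position plays the role of QTextBlock::position(), and number is the cached block number (counting from 1).

```cpp
#include <list>

struct Block {
    int position; // character offset where the block starts
    int number;   // cached block number, possibly stale
};
using Document = std::list<Block>;

// True if every block starting at or before lastCorrectBlockPosition carries
// the number it would get by counting from the top of the document (1-based).
// Blocks beyond that position are deliberately not inspected.
bool invariantHolds(const Document &doc, int lastCorrectBlockPosition)
{
    int expected = 1;
    for (const Block &block : doc) {
        if (block.position > lastCorrectBlockPosition)
            break;
        if (block.number != expected)
            return false;
        ++expected;
    }
    return true;
}
```

A check like this is only useful in tests or debug builds, since it costs as much as the brute-force count.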
Maintaining the invariant
To maintain the invariant, you need to know when the document is changed. Connecting to the signal QTextDocument::contentsChange(int position, int charsRemoved, int charsAdded) will tell you this. The variables position and charsAdded give you the range of blocks to look at when deciding how to update lastCorrectBlockPosition. If the blocks at those positions are invalid, you just skip updating anything this time around (or, depending on the implementation, you set lastCorrectBlockPosition to some invalid value to indicate that the document just had all its blocks removed). Note that you can find the block at a given position using QTextDocument::findBlock(int pos). You may need to be mindful of when this signal is emitted if you use QTextCursor::beginEditBlock() and QTextCursor::endEditBlock() when editing the document. But it shouldn't be a problem as long as you call QTextCursor::endEditBlock() in the same function in which you called QTextCursor::beginEditBlock().
Getting back to maintaining the invariant, we need to update the block numbers inclusively between the block at position (call it startBlock) and the block at position + charsAdded (call it endBlock). But first we need to know what block number to start counting from -- which is just the block number of the block before startBlock. Assign previousBlock as the block you get from startBlock.previous(). The block previousBlock either returns false for isValid() or, if valid, contains a correct block number due to our stated invariant. (Strictly speaking, that only holds if previousBlock lies at or before lastCorrectBlockPosition. If the edit starts further down, it touches only blocks whose numbers are already considered out of date, so nothing needs updating.) If previousBlock is invalid, we can just say that it has a block number of 0 (if you're counting from 1).
You can use QTextBlockUserData to store a blockNumber field.
After iterating, everything below the position of endBlock does not necessarily have a valid block number. But the invariant is maintained: from the top of the document down to the position of endBlock, the QTextBlocks have the correct block number. That is, lastCorrectBlockPosition is now the position of endBlock. However, that might not be the case if you use a further optimization: if you determine that no QTextBlocks were added or removed for the signal contentsChange(int position, int charsRemoved, int charsAdded), then the last correct block is still the same block as before. Because the edit may have shifted that block's actual position, it would be better to store the last correct block itself instead of its position if you add this optimization.
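The basic update (without the further optimization) can be sketched roughly as follows. This is a Qt-free model under stated assumptions, not the implementation mentioned above: Block, BlockNumberCache, findBlock and onContentsChange are hypothetical names, the list stands in for the document after the edit, and the number field stands in for whatever you keep in QTextBlockUserData. The early return for edits that lie entirely below the last correct block is an assumption added here so that the invariant cannot be broken.

```cpp
#include <iterator>
#include <list>

struct Block {
    int position; // character offset of the block start
    int number;   // cached block number; stale beyond the last correct block
};
using Document = std::list<Block>;
using BlockIt = Document::iterator;

struct BlockNumberCache {
    int lastCorrectBlockPosition = -1; // -1: no block is known to be correct
};

// Simplified stand-in for QTextDocument::findBlock(pos): the last block that
// starts at or before pos, or doc.end() if there is none.
BlockIt findBlock(Document &doc, int pos)
{
    BlockIt found = doc.end();
    for (BlockIt it = doc.begin(); it != doc.end() && it->position <= pos; ++it)
        found = it;
    return found;
}

// Call after the document has changed, with the values reported by
// contentsChange(position, charsRemoved, charsAdded).
void onContentsChange(Document &doc, BlockNumberCache &cache,
                      int position, int charsAdded)
{
    BlockIt startBlock = findBlock(doc, position);
    BlockIt endBlock = findBlock(doc, position + charsAdded);
    if (startBlock == doc.end() || endBlock == doc.end())
        return; // invalid blocks: skip updating this time around

    int number = 0; // block number of a missing previousBlock, counting from 1
    if (startBlock != doc.begin()) {
        BlockIt previousBlock = std::prev(startBlock);
        if (previousBlock->position > cache.lastCorrectBlockPosition)
            return; // assumption: the edit only touched already-stale blocks
        number = previousBlock->number;
    }

    // Renumber startBlock..endBlock inclusively.
    for (BlockIt it = startBlock; ; ++it) {
        it->number = ++number;
        if (it == endBlock)
            break;
    }
    cache.lastCorrectBlockPosition = endBlock->position;
}
```

Note that an edit near the top pulls lastCorrectBlockPosition back up to endBlock, even if no blocks were added or removed. That is the cost the optimization described above is meant to avoid.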
Lazy update - Providing an API for block numbers
Clients must not access the block number from the QTextBlockUserData directly. Instead they provide a starting and ending QTextBlock, and get an integer pair back specifying the corresponding starting and ending block numbers. You could also return a list instead, with additional, individual data for each text block.
When the client asks for the block numbers (say inclusively between startBlock and endBlock), you have to iterate from the block at lastCorrectBlockPosition down to endBlock, updating the block numbers along the way. If endBlock already lies at or before lastCorrectBlockPosition, there is nothing to update. Otherwise the work is proportional to the number of blocks between the two, which is not much of a penalty. After you've updated the blocks this way, you set the new value of lastCorrectBlockPosition to the maximum of the position of endBlock and the current value of lastCorrectBlockPosition. This way you're still keeping track of what's valid.
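A lazy query along these lines might look like the sketch below. Again this is a Qt-free model, not the real implementation: Block, BlockNumberCache and blockNumbers are hypothetical names, and the initial walk from the top only exists because the model has no equivalent of QTextDocument::findBlock(lastCorrectBlockPosition), which would let you start at the last correct block directly.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <utility>

struct Block {
    int position; // character offset of the block start
    int number;   // cached block number; stale beyond the last correct block
};
using Document = std::list<Block>;
using BlockIt = Document::iterator;

struct BlockNumberCache {
    int lastCorrectBlockPosition = -1; // -1: no block is known to be correct
};

// Returns the block numbers of startBlock and endBlock, bringing stale
// numbers up to date on demand.
// Precondition: both iterators point into doc, startBlock at or before endBlock.
std::pair<int, int> blockNumbers(Document &doc, BlockNumberCache &cache,
                                 BlockIt startBlock, BlockIt endBlock)
{
    if (endBlock->position > cache.lastCorrectBlockPosition) {
        // Skip the blocks that are already correct, remembering the last number.
        int number = 0;
        BlockIt it = doc.begin();
        while (it->position <= cache.lastCorrectBlockPosition) {
            number = it->number;
            ++it;
        }
        // Number the stale blocks up to and including endBlock.
        for (;; ++it) {
            it->number = ++number;
            if (it == endBlock)
                break;
        }
    }
    cache.lastCorrectBlockPosition =
        std::max(cache.lastCorrectBlockPosition, endBlock->position);
    return std::make_pair(startBlock->number, endBlock->number);
}
```

A line-number margin would call this with the first and last visible blocks on each repaint. Scrolling back up into already-numbered territory then costs nothing beyond the lookup.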