**Perf: Optimize text selection and navigation performance for large
documents**
#### **Summary**
This PR significantly improves the performance of text selection
(double-clicking) and cursor navigation within `TextEdit` and `Label`
widgets, particularly when handling large documents (e.g., 1 MB+ text
files or long logs). It eliminates several $O(N^2)$ bottlenecks and
unnecessary memory allocations in `text_cursor_state.rs`.
#### **Problems Identified**
1. **$O(N^2)$ Word Boundary Scanning:** In
`next_word_boundary_char_index`, `char_index_from_byte_index` was called
repeatedly inside a loop. This caused the entire document to be scanned
from the beginning for every word found, leading to quadratic time
complexity.
2. **Heavy String Allocations:** `ccursor_previous_word` used
`collect::<String>()` and `rev()` to search backwards, causing a full
copy and memory allocation of the text (or line) every time the user
moved the cursor or double-clicked.
3. **Inefficient Line Start Finding:** `find_line_start` performed
global character counts (`text.chars().count()`) and global skips, which
is very slow for large files.
4. **Global Search Scope:** `select_word_at` searched for word
boundaries across the entire document, even for a simple double-click
on a single word.
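
The quadratic pattern from item 1 can be illustrated with a minimal, std-only sketch. This is not the code from `text_cursor_state.rs`: the helper below is a hypothetical stand-in for `char_index_from_byte_index`, and words are detected with `char::is_alphanumeric` instead of `unicode-segmentation`, purely to keep the example self-contained.

```rust
/// Hypothetical stand-in: converts a byte offset to a char index by
/// scanning from the start of the text, so each call costs O(byte_index).
fn char_index_from_byte_index(text: &str, byte_index: usize) -> usize {
    text.char_indices()
        .take_while(|(b, _)| *b < byte_index)
        .count()
}

/// Quadratic pattern: every word start triggers a fresh scan from the
/// beginning of the text to translate its byte offset.
fn word_starts_quadratic(text: &str) -> Vec<usize> {
    let mut starts = Vec::new();
    let mut in_word = false;
    for (byte_index, ch) in text.char_indices() {
        let is_word = ch.is_alphanumeric();
        if is_word && !in_word {
            starts.push(char_index_from_byte_index(text, byte_index));
        }
        in_word = is_word;
    }
    starts
}

/// Linear pattern: a running char counter (`enumerate`) is carried along,
/// so the text is scanned exactly once.
fn word_starts_linear(text: &str) -> Vec<usize> {
    let mut starts = Vec::new();
    let mut in_word = false;
    for (char_index, ch) in text.chars().enumerate() {
        let is_word = ch.is_alphanumeric();
        if is_word && !in_word {
            starts.push(char_index);
        }
        in_word = is_word;
    }
    starts
}
```

Both functions return the same char indices; only the amount of rescanning differs, which is what dominates on multi-megabyte inputs.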
#### **Key Changes & Optimizations**
1. **Line-Scoped Selection:** Updated `select_word_at` to first identify
the current line and then perform word boundary searches within that
local scope. This reduces the search space from millions of characters
to hundreds.
2. **Linear Time ($O(N)$) Boundary Search:** Refactored
`next_word_boundary_char_index` to use a running cumulative character
counter. This ensures the text is scanned only once.
3. **Zero-Allocation Backwards Search:** Optimized
`ccursor_previous_word` to use `next_back()` on the
`DoubleEndedIterator` provided by `unicode-segmentation`. This removes
all temporary `String` allocations.
4. **Byte-Based Line Search:** Optimized `find_line_start` to use
byte-based reverse scanning (`rfind('\n')`), which is significantly
faster than counting characters from the start of the document.
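
A rough sketch of how line-scoped selection, byte-based line search, and allocation-free backwards scanning can fit together is shown below. It rests on several assumptions: it works on byte offsets rather than egui's char-based cursors, it treats alphanumerics and `_` as word characters instead of using `unicode-segmentation`, and the function names `line_byte_range` and `select_word_in_line` are hypothetical, not taken from the PR.

```rust
use std::ops::Range;

/// Byte range of the line containing `byte_index`, excluding the '\n'.
/// `byte_index` is assumed to lie on a char boundary.
fn line_byte_range(text: &str, byte_index: usize) -> Range<usize> {
    // Reverse scan for the previous newline instead of counting chars
    // from the start of the document.
    let start = text[..byte_index].rfind('\n').map_or(0, |i| i + 1);
    let end = text[byte_index..]
        .find('\n')
        .map_or(text.len(), |i| byte_index + i);
    start..end
}

/// Byte range of the word around `byte_index`, searched only within
/// the current line.
fn select_word_in_line(text: &str, byte_index: usize) -> Range<usize> {
    let line = line_byte_range(text, byte_index);
    let is_word = |c: char| c.is_alphanumeric() || c == '_';

    // Walk backwards without allocating: `CharIndices` is a
    // `DoubleEndedIterator`, so `next_back()` yields chars from the end.
    let mut start = byte_index;
    let mut before = text[line.start..byte_index].char_indices();
    while let Some((i, c)) = before.next_back() {
        if !is_word(c) {
            break;
        }
        start = line.start + i;
    }

    // Walk forwards until the first non-word char or the end of the line.
    let mut end = byte_index;
    for (i, c) in text[byte_index..line.end].char_indices() {
        if !is_word(c) {
            break;
        }
        end = byte_index + i + c.len_utf8();
    }
    start..end
}
```

For example, with the text `"first line\nsecond wörd_x here\nthird"` and the cursor on the `r` of `wörd_x`, `select_word_in_line` returns the byte range covering `wörd_x`, and no character outside the second line is inspected after the two newline searches.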
#### **Performance Impact**
In my tests with large text files (over 10,000 lines / 1MB+):
- **Before:** Double-clicking a word caused a UI freeze for 2–5 seconds.
- **After:** Word selection and navigation are near-instantaneous
(0–1ms), providing a smooth "native-like" experience even in WASM
environments.