wxpath TUI - Interactive Expression TestingΒΆ
NOTE: I highly recommended you enable caching (Ctrl+L) for faster execution, and set depth (i.e.,
url('...', depth=...)) for capped crawls to be polite to the servers you are crawling.
β¨ FeaturesΒΆ
π Top Panel - Expression EditorΒΆ
- Syntax-aware text editing
- Real-time validation feedback
- Smart bracket/quote matching
- Inline error detection
π Bottom Panel - Live Output DisplayΒΆ
- HTML Elements: Formatted with partial content display (first 300 chars)
- Dict/XPathMap: Automatically rendered as elegant tables
- Sortable columns: Click a column header to sort by that column; click again to toggle ascending/descending
- Export: Export table data to CSV or JSON (Ctrl+E or Export button)
- Error Messages: Clear validation and execution feedback
- Waiting State: Shows when expression is incomplete or invalid
- Streaming Results: Live updates as data arrives (max 10 items shown)
- Cancel Crawl: Press Escape during a run to stop the crawl; results already received stay in the table
π InstallationΒΆ
Install wxpath with TUI support:
Or install textual separately if wxpath is already installed:
π― UsageΒΆ
Launch the TUIΒΆ
KeybindingsΒΆ
| Key | Action | Description |
|---|---|---|
Ctrl+R or F5 |
Execute | Run the current expression |
Escape |
Cancel Crawl | Stop the running crawl; partial results are kept |
Ctrl+E |
Export | Export table data (CSV or JSON) |
Ctrl+C |
Clear | Clear the output panel |
Ctrl+H |
Headers | Configure HTTP headers (JSON) |
Ctrl+Shift+S |
Settings | Edit persistent crawler settings (CONCURRENCY, PER_HOST, RESPECT_ROBOTS) |
Ctrl+L |
Cache | Toggle HTTP caching on/off (SQLite for now) |
Ctrl+Shift+D |
Toggle Debug | Show or hide the debug panel |
Ctrl+Q |
Quit | Exit the application |
| Click column header | Sort | Sort table by that column; click again to toggle ascending/descending |
π Example ExpressionsΒΆ
1. Simple Text ExtractionΒΆ
Output: List of text strings2. Map Extraction (Table View)ΒΆ
url('https://quotes.toscrape.com')//div[@class='quote']/map {
'quote': .//span[@class='text']/text(),
'author': .//span[@class='author']/text(),
'tags': .//div[@class='tags']//a/text()
}
3. Link Following (Crawling)ΒΆ
url('https://quotes.toscrape.com')
///url(//a[contains(@href, '/author/')]/@href)
//h3[@class='author-title']/text()
4. HTML Element ExtractionΒΆ
Output: Partial HTML of matching elementsποΈ ArchitectureΒΆ
The TUI embodies wxpath's architectural philosophy:
βββββββββββββββββββββββββββββββββββββββ
β Textual Framework β β Modern TUI with Rich rendering
βββββββββββββββββββββββββββββββββββββββ€
β Expression Editor (TextArea) β β Real-time validation
βββββββββββββββββββββββββββββββββββββββ€
β WXPath Engine β β Async concurrent execution
βββββββββββββββββββββββββββββββββββββββ€
β Output Renderer β β Smart formatting (HTML/Table)
β β’ HTML Elements β
β β’ Dict β Table β
β β’ Error Messages β
βββββββββββββββββββββββββββββββββββββββ
Key ComponentsΒΆ
- Textual: Modern terminal UI framework with Rich rendering
- WXPath Engine: Async execution with concurrent crawling
- Reactive Validation: Live feedback as you type
- Smart Formatting: Automatic detection and formatting of result types
- Hook System: XPathMap serialization for clean dict output
π How It WorksΒΆ
Expression ValidationΒΆ
The TUI validates your expression in real-time:
- Balance Checking: Parentheses
(), brackets[], braces{} - Quote Matching: Single
'and double"quotes - Syntax Validation: Parser checks for valid wxpath syntax
- Feedback Display: Shows "Waiting" until expression is complete
Execution FlowΒΆ
User Types β Validation β Press Execute β Parse β Run Engine β Format β Display
β β β β β β β
TextArea Balance? Parser OK? AST Built HTTP Req HTML/Table Output
β
"Waiting" or "Valid"
Output FormattingΒΆ
| Input Type | Output Format | Details |
|---|---|---|
HtmlElement |
Partial HTML string | First 300 chars, escaped |
dict (single) |
Indented key-value | Pretty-printed |
[dict, dict, ...] |
Table | Columns auto-detected |
str |
Plain text | Truncated if >200 chars |
| Other | String repr | Generic fallback |
βοΈ ConfigurationΒΆ
Persistent Settings (Ctrl+Shift+S)ΒΆ
Crawler settings are saved to a config file and reused across sessions:
| Setting | Description | Default |
|---|---|---|
| CONCURRENCY | Maximum concurrent HTTP requests | 16 |
| PER_HOST | Maximum concurrent requests per host | 8 |
| RESPECT_ROBOTS | Whether to respect robots.txt | ON (true) |
- Config file:
~/.config/wxpath/tui_settings.json(or$XDG_CONFIG_HOME/wxpath/tui_settings.jsonif set). - When applied: Values are used for the next expression run after you save.
- Extending: New settings can be added by extending the schema in
src/wxpath/tui_settings.pyand using the value where the crawler/engine is created.