Link to original content: http://github.com/mknz/mirusan
GitHub - mknz/mirusan: A PDF collection reader with built-in full-text search engine
This repository has been archived by the owner on Aug 20, 2019. It is now read-only.
A PDF collection reader with built-in full-text search engine

Mirusan

A PDF collection reader with a built-in full-text search engine

Written in Python / Electron / Elm / JavaScript

Features

  • Simple UI

  • Local database (you have 100% control of your data)

  • Easy installation (No need to install external databases)

  • Multiplatform (Linux, Mac, Windows)

Installation

Prerequisites

Instructions

git clone https://github.com/mknz/mirusan.git

cd ./mirusan
cd ./search
pip install -r requirements.txt

cd ../electron
npm install
npm run compile

npm start

Language support

Mirusan automatically detects the input language using Google's language-detection library. The tokenizer or analyzer used for indexing is chosen according to the detected language.

For the following languages, Whoosh's built-in LanguageAnalyzer is used (StandardAnalyzer for English), though indexing currently does not work properly for Arabic.

Arabic
Danish
Dutch
English
Finnish
French
German
Hungarian
Italian
Norwegian
Portuguese
Romanian
Russian
Spanish
Swedish
Turkish

For all other languages, an N-gram tokenizer (minsize=1, maxsize=2) is used.
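
The selection rule described in this section can be sketched in plain Python. This is only an illustration and not Mirusan's actual code: the names `choose_analyzer`, `char_ngrams`, and `LANGUAGE_ANALYZER_CODES` are hypothetical, the two-letter language codes are an assumption about what the language detector returns, and `char_ngrams` is a simplified stand-in for an N-gram tokenizer rather than Whoosh's implementation.

```python
# Hypothetical sketch of the analyzer selection described above.
# Assumption: the detector returns ISO 639-1 codes such as "en" or "ja".
LANGUAGE_ANALYZER_CODES = {
    "ar", "da", "nl", "fi", "fr", "de", "hu", "it",
    "no", "pt", "ro", "ru", "es", "sv", "tr",
}


def choose_analyzer(lang_code):
    """Return a label for the kind of analyzer to use for a language."""
    if lang_code == "en":
        return "standard"   # StandardAnalyzer for English
    if lang_code in LANGUAGE_ANALYZER_CODES:
        return "language"   # language-specific analyzer
    return "ngram"          # fallback for every other language


def char_ngrams(text, minsize=1, maxsize=2):
    """Emit character n-grams of each size at every start position."""
    grams = []
    for start in range(len(text)):
        for size in range(minsize, maxsize + 1):
            end = start + size
            if end > len(text):
                break  # not enough characters left for this size
            grams.append(text[start:end])
    return grams


if __name__ == "__main__":
    print(choose_analyzer("de"))
    print(choose_analyzer("ja"))
    print(char_ngrams("日本語"))
```

With minsize=1 and maxsize=2, every single character and every adjacent pair of characters becomes an index term, which lets text without whitespace word boundaries (such as Japanese) be searched by substring.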

License

GPLv3

Acknowledgements

Whoosh (Pure Python search engine library)

pdf.js

Electron

Photon

Elm

elm-electron