
Goals

The project aims to help a moderator determine, without manually reviewing each PDF file, whether the file is a magazine and assign it to a specific category. To do this, the user first creates a list of search tags, which are grouped into categories. PDF files are then uploaded, and the parser counts how many times each tag occurs in each document. The result is a list of files, each with a preview (the first page) and additional information (number of pages, original title, etc.).
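The tag counting described above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual PHP code. The function name `count_tags`, the sample categories, and the choice of case-insensitive, whole-word matching of single-word tags are all assumptions.

```python
import re
from collections import Counter


def count_tags(text, categories):
    """Count occurrences of each tag in the text, grouped by category.

    Assumption: tags are single words, matched case-insensitively
    against whole words of the extracted PDF text.
    """
    words = Counter(re.findall(r"\w+", text.lower()))
    return {
        category: {tag: words[tag.lower()] for tag in tags}
        for category, tags in categories.items()
    }


# Hypothetical categories and text, for illustration only.
categories = {"fashion": ["fashion", "style"], "nature": ["tree", "maple"]}
text = "Fashion week: the style guide. Fashion under a tree."
counts = count_tags(text, categories)
```

Here `counts["fashion"]` holds the per-tag totals for the "fashion" category, so a moderator can see at a glance which category dominates a document.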

The user can also create filters to produce specific results. For example, a file must contain the word "tree" but not the word "maple", or a magazine must contain the word "fashion" at least 20 times to be placed in a certain category. The user can also view the list of parsed files and, if a file was not classified as a magazine for some reason, manually assign it to a specific category. The list of files, with names and additional information, can then be exported.
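One way such filters could be modelled is shown below. This is a hedged sketch in Python rather than the project's implementation. The `Filter` class, its `min_counts` and `forbidden` fields, and the first-match-wins rule in `categorize` are hypothetical names and design choices made for the example.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Filter:
    """Hypothetical filter: minimum word counts plus forbidden words.

    Words in min_counts and forbidden are expected in lowercase.
    """
    category: str
    min_counts: dict = field(default_factory=dict)  # word -> minimum occurrences
    forbidden: tuple = ()                           # words that must not appear

    def matches(self, words):
        has_required = all(words[w] >= n for w, n in self.min_counts.items())
        has_forbidden = any(words[w] > 0 for w in self.forbidden)
        return has_required and not has_forbidden


def categorize(text, filters):
    """Return the category of the first matching filter, or None."""
    words = Counter(re.findall(r"\w+", text.lower()))
    for f in filters:
        if f.matches(words):
            return f.category
    return None


# The two example rules from the text above.
filters = [
    Filter("trees", min_counts={"tree": 1}, forbidden=("maple",)),
    Filter("fashion magazines", min_counts={"fashion": 20}),
]
```

A file for which `categorize` returns `None` corresponds to the case where the moderator assigns the category manually.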

Progress: the parser is currently ready. It supports drag-and-drop file upload, keeps a parsing history that displays the list of files with previews, counts the number of tags, and extracts metadata from each file.

 

Solution

The project is built on the Laravel 5 PHP framework. Xpdf, a C++ library, lets us extract text, images, and metadata. Ghostscript is used to remove protection from protected files.
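The page does not show how these tools are called, so the following is only a sketch of one possible integration, written in Python for brevity. It assumes the Xpdf command-line tools `pdftotext` and `pdfinfo` and the Ghostscript executable are available on the PATH (on Windows the Ghostscript console binary is usually named `gswin64c`). Rewriting a file through Ghostscript's `pdfwrite` device is a common way to obtain an unrestricted copy, but whether the project does exactly this is an assumption. All function names are hypothetical.

```python
import subprocess


def pdftotext_cmd(pdf_path):
    # "-" as the output file makes pdftotext write the text to stdout.
    return ["pdftotext", pdf_path, "-"]


def pdfinfo_cmd(pdf_path):
    # pdfinfo prints metadata as "Key: value" lines (Title, Pages, ...).
    return ["pdfinfo", pdf_path]


def ghostscript_rewrite_cmd(src, dst, gs="gs"):
    # Re-render the PDF into a new file with the pdfwrite device.
    return [gs, "-q", "-dNOPAUSE", "-dBATCH",
            "-sDEVICE=pdfwrite", f"-sOutputFile={dst}", src]


def parse_pdfinfo(output):
    """Turn pdfinfo's "Key: value" lines into a dictionary of strings."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def run(cmd):
    """Run a command and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

With the tools installed, `parse_pdfinfo(run(pdfinfo_cmd("magazine.pdf")))` would give the page count and original title, and `run(pdftotext_cmd("magazine.pdf"))` the text to count tags in. The file name is a placeholder.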

The parser is fairly fast: 50 random files ranging from 1 MB to 80 MB (with and without protection) are processed in about 1 minute.

Development was carried out locally on Windows, but the project can be adapted for Linux and macOS.

 

Technologies

Laravel 5, Xpdf, Ghostscript, PHP

 

Team

A team of 3 specialists worked on this project:

- Project manager: communication with the customer, distribution and control of tasks;

- Web developer: development of the project;

- Tester: testing of the project.

 
