PDF Analyser
Graphic Design, PHP

Goals

The purpose of the project is to help a moderator determine, without manually reviewing each PDF file, whether the file is a magazine and which category it belongs to. To start, the user creates a list of tags to search for; these tags are grouped into categories. PDF files are then uploaded, and the parser counts the occurrences of each tag in every document. The result is a list with a preview (the first page) and additional information (number of pages, original title, etc.).
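
The tag-counting step described above can be illustrated with a small sketch. It is written in Python for brevity (the project itself is built in PHP), and the category names, the tag list, and the `count_tags` helper are all hypothetical; it also assumes single-word tags matched case-insensitively, which the project description does not specify.

```python
import re
from collections import Counter

# Hypothetical tag list grouped by category (illustrative values only).
TAGS_BY_CATEGORY = {
    "fashion": ["fashion", "style"],
    "nature": ["tree", "maple"],
}

def count_tags(text, tags_by_category):
    """Return {category: {tag: occurrences}} using whole-word, case-insensitive matching."""
    # Split the text into lowercase words once, then look each tag up.
    words = Counter(re.findall(r"\w+", text.lower()))
    return {
        category: {tag: words[tag.lower()] for tag in tags}
        for category, tags in tags_by_category.items()
    }

sample = "Fashion week: tree-lined streets, fashion shows, and more fashion."
counts = count_tags(sample, TAGS_BY_CATEGORY)
print(counts)
```

In a real pipeline the `text` argument would be the text extracted from the PDF rather than a literal string.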

The user can also create filters to produce specific results. For example, a file must contain the word "tree" but not the word "maple", or a magazine must contain at least 20 occurrences of the word "fashion" to be placed in a certain category. The user can also view the list of parsed files and, if a file was not classified as a magazine for some reason, manually assign it to a specific category. The file list can then be exported with the file names and additional information.
Progress: at the moment, the parser is ready, with drag-and-drop file upload and a saved parsing history that displays the list of files with previews; it also counts the number of tags and extracts metadata from each file.
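
As a rough illustration of the filter rules from the examples above ("tree" but not "maple", at least 20 occurrences of "fashion"), here is a minimal Python sketch. The `Filter` class, its field names, and the `categorize` helper are assumptions made for this example and are not taken from the project's code.

```python
from dataclasses import dataclass, field

@dataclass
class Filter:
    """Hypothetical filter rule: required words, forbidden words, minimum counts."""
    category: str
    must_contain: list = field(default_factory=list)
    must_not_contain: list = field(default_factory=list)
    min_counts: dict = field(default_factory=dict)

    def matches(self, word_counts):
        """Check the rule against a {word: occurrences} mapping."""
        def count(word):
            return word_counts.get(word, 0)
        return (
            all(count(w) > 0 for w in self.must_contain)
            and all(count(w) == 0 for w in self.must_not_contain)
            and all(count(w) >= n for w, n in self.min_counts.items())
        )

def categorize(word_counts, filters):
    """Return the category of the first matching filter, or None if none match."""
    for rule in filters:
        if rule.matches(word_counts):
            return rule.category
    return None

filters = [
    Filter("fashion magazines", min_counts={"fashion": 20}),
    Filter("trees", must_contain=["tree"], must_not_contain=["maple"]),
]

print(categorize({"fashion": 25}, filters))
print(categorize({"tree": 3, "maple": 1}, filters))
```

A file for which `categorize` returns `None` corresponds to the case where the moderator assigns the category manually.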
 
Solution
To implement this project, we used the Laravel 5 PHP framework together with Xpdf, a C++ library that allows us to extract text, images, and metadata. Ghostscript was used to remove protection from protected files.
The parser works fairly quickly: 50 random files from 1 MB to 80 MB (with and without protection) are processed in about 1 minute.
Development was carried out locally on Windows, but the project can be adapted for Linux and macOS.
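
For readers unfamiliar with these tools, the sketch below shows one way the Xpdf command-line programs (`pdftotext`, `pdfinfo`) and Ghostscript could be driven from a script. It is written in Python rather than PHP, and the helper names are hypothetical. It assumes the executables are on the PATH (on Windows the Ghostscript console binary is usually `gswin64c` rather than `gs`) and that rewriting a file through Ghostscript's `pdfwrite` device is enough to lift restrictions; the project description does not say exactly how protection was removed.

```python
import subprocess

def pdftotext_cmd(pdf_path):
    # "-" as the output file asks pdftotext to write the text to stdout.
    return ["pdftotext", pdf_path, "-"]

def unlock_cmd(src, dst, gs_binary="gs"):
    # Assumption: rewrite the PDF through Ghostscript's pdfwrite device.
    return [gs_binary, "-q", "-dNOPAUSE", "-dBATCH",
            "-sDEVICE=pdfwrite", f"-sOutputFile={dst}", src]

def parse_pdfinfo(output):
    """Turn pdfinfo's "Key: value" lines into a dict of strings."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as dates stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def extract(pdf_path):
    """Run pdftotext and pdfinfo; both must be installed for this to work."""
    text = subprocess.run(pdftotext_cmd(pdf_path), capture_output=True,
                          text=True, check=True).stdout
    meta_raw = subprocess.run(["pdfinfo", pdf_path], capture_output=True,
                              text=True, check=True).stdout
    return text, parse_pdfinfo(meta_raw)

# Parsing a made-up pdfinfo output (no external tools needed for this part).
sample_output = "Title:          Sample Magazine\nPages:          120\n"
print(parse_pdfinfo(sample_output))
```

Keeping the command construction and the output parsing in separate functions makes it easier to swap executable names when moving between Windows, Linux, and macOS.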
 
Technologies
Laravel 5, Xpdf, Ghostscript, PHP
 
Team
A team of 3 specialists worked on this project:
- Project manager: communication with the customer, distribution and control of tasks;
- Web developer: development of the project;
- Tester: testing of the project.

 
