uPlanet | PDF Analyser

Goals

The project helps a moderator determine, without reviewing each PDF file by hand, whether the file is a magazine and which category it belongs to. To set this up, the user creates a list of search tags, grouped into categories. PDF files are then uploaded and the parser counts how many times each tag occurs in each document. The result is a list of files, each with a preview (the first page) and additional information (number of pages, original title, etc.).
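The tag-counting step can be sketched roughly as follows. This is an illustration in Python, not the project's PHP code; the function name `count_tags`, the sample categories, and the choice of case-insensitive whole-word matching for single-word tags are all assumptions.

```python
import re
from collections import Counter


def count_tags(text, categories):
    """Count how often each tag occurs in the extracted document text.

    `categories` maps a category name to a list of single-word tags
    (hypothetical structure). Matching is case-insensitive and on whole
    words, which is an assumption about how tags should be compared.
    """
    words = Counter(re.findall(r"\w+", text.lower()))
    return {
        category: {tag: words[tag.lower()] for tag in tags}
        for category, tags in categories.items()
    }


if __name__ == "__main__":
    sample_text = "Fashion week: fashion trends and a maple tree."
    sample_categories = {"style": ["fashion", "trends"], "nature": ["tree", "oak"]}
    print(count_tags(sample_text, sample_categories))
```

Counting all words once and then looking up each tag keeps the cost per document independent of the number of tags.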

The user can also create filters to produce specific results. For example, a file must contain the word "tree" but not the word "maple", or a magazine must contain the word "fashion" at least 20 times to be placed in a certain category. The user can also view the list of parsed files and, if a file was not classified as a magazine for some reason, manually assign it to a category. The file list can then be exported with the file names and additional information.
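One possible way to model such filters is a rule with minimum counts for required words and a set of forbidden words. The sketch below is in Python for illustration only; the `Filter` class, `assign_category`, the category names, and the first-match-wins ordering are assumptions and not taken from the project.

```python
from dataclasses import dataclass, field


@dataclass
class Filter:
    """A hypothetical filter rule that routes a file to a category."""
    category: str
    min_counts: dict = field(default_factory=dict)  # word -> minimum occurrences
    forbidden: tuple = ()                           # words that must not occur

    def matches(self, counts):
        has_required = all(
            counts.get(word, 0) >= minimum
            for word, minimum in self.min_counts.items()
        )
        has_forbidden = any(counts.get(word, 0) > 0 for word in self.forbidden)
        return has_required and not has_forbidden


def assign_category(counts, filters):
    """Return the category of the first matching filter, or None."""
    for rule in filters:
        if rule.matches(counts):
            return rule.category
    return None


# The two examples from the text, with made-up category names.
FILTERS = [
    Filter("Gardening", min_counts={"tree": 1}, forbidden=("maple",)),
    Filter("Fashion", min_counts={"fashion": 20}),
]
```

A file for which `assign_category` returns `None` would be left for the moderator to categorize manually.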

Progress: the parser is ready. It supports drag-and-drop file upload, keeps a history of parsing runs, displays a list of files with previews, counts the tags, and extracts metadata from each file.

Solution

We built this project with the Laravel 5 PHP framework. The Xpdf C++ library lets us extract text, images, and metadata. Ghostscript was used to remove protection from protected files.
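The text does not say exactly how Xpdf and Ghostscript are invoked. One common approach is to call their command-line tools, sketched here in Python for illustration. The helper names are hypothetical, the executable names vary by platform (for example, Ghostscript's console binary on Windows is typically `gswin64c`), and whether rewriting a file with `pdfwrite` lifts its restrictions depends on the file and the Ghostscript version.

```python
import subprocess


def pdftotext_cmd(pdf_path):
    # Xpdf's pdftotext writes to stdout when the output file is "-".
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def pdfinfo_cmd(pdf_path):
    # Xpdf's pdfinfo prints "Key: value" metadata lines (Title, Pages, ...).
    return ["pdfinfo", pdf_path]


def ghostscript_rewrite_cmd(src_path, dst_path, gs_binary="gs"):
    # Re-writes the PDF through Ghostscript's pdfwrite device
    # (an assumed invocation, not confirmed by the source).
    return [
        gs_binary, "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite", "-sOutputFile=" + dst_path, src_path,
    ]


def parse_pdfinfo(output):
    """Turn pdfinfo's "Key: value" lines into a dictionary of strings."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_metadata(pdf_path):
    """Run pdfinfo on a file and parse its output (requires Xpdf installed)."""
    result = subprocess.run(
        pdfinfo_cmd(pdf_path), capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)
```

Here `read_metadata("magazine.pdf")["Pages"]` would give the page count as a string, provided the Xpdf tools are on the `PATH`.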

The parser is fairly fast: 50 random files ranging from 1 MB to 80 MB (with and without protection) are processed in about 1 minute.

Development was carried out locally on Windows, but the project can be adapted for Linux and macOS.

Technologies

Laravel 5, Xpdf, Ghostscript, PHP

Team

A team of three specialists worked on this project:

- Project manager: communication with the customer, distribution and control of tasks;

- Web developer: development of the project;

- Tester: testing of the project.

