Contribute.xml flagged as vulnerability
Hi,
I have been asked by a customer to research a vulnerability that has been flagged by McAfee Secure on their website. They report that contribute.xml contains info that would be helpful to a hacker and recommend restricting access to that file. So my question is, can I simply put a password on that file? Will the software still work OK? Has anyone else seen this error? I have tried searching the web and did not see anything relating directly to this issue.
My apologies if I am asking a dumb question or am in the wrong forum, but I am getting a little desperate at this point and I'm not really sure where to turn for the answer.
Thanks in advance for any help.....
@cdmweb - have you discovered a solution to this?
@ ADOBE TECH SUPPORT: When I called, you indicated that telephone support for Contribute 4 is not offered for this product and that I should create a ticket in the support forums. Here's another user with the exact same problem and he's waited OVER 2 MONTHS for an answer.
Every time I rename or remove this file, it gets recreated and I can't get rid of the problem for good. I've searched your support forums, searched 3rd party forums, searched Google, called McAfee and called an Adobe dealer and nobody has an answer. This is a security vulnerability that your product creates, and I've got to believe that other larger clients have requested a fix to this problem. Can you please provide a way to solve this problem once and for all?
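For anyone who wants to confirm what the scanner is seeing, here is a minimal Python sketch that checks whether _mm/contribute.xml is served to anonymous HTTP visitors. It only diagnoses exposure; it does not fix it, and nothing in this thread establishes whether Contribute keeps working once HTTP access to the file is blocked. The function names and the `fetch` parameter are my own, made up for illustration.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def contribute_xml_url(site_root):
    """Build the URL of _mm/contribute.xml under a site root URL."""
    # A trailing slash makes urljoin append to the root instead of replacing its last segment.
    if not site_root.endswith("/"):
        site_root += "/"
    return urljoin(site_root, "_mm/contribute.xml")


def _http_status(url):
    """Return the HTTP status code for url, or None if the host cannot be reached."""
    try:
        with urlopen(Request(url, method="GET"), timeout=10) as resp:
            return resp.status
    except HTTPError as err:
        return err.code
    except URLError:
        return None


def is_publicly_readable(site_root, fetch=None):
    """Return True if an anonymous GET for contribute.xml answers with status 200."""
    # `fetch` is a hypothetical hook so the check can be exercised without a network.
    if fetch is None:
        fetch = _http_status
    return fetch(contribute_xml_url(site_root)) == 200
```

Calling `is_publicly_readable("http://www.example.com/")` against your own site root (example.com is a placeholder) returns True when the file is world-readable over HTTP, which is presumably the condition the scanner flags.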
Similar Messages
-
I am experiencing a problem with Contribute CS5 and the editing of XML files.
Previously in versions CS3 and CS4, browsing to an XML file and clicking "Edit Page" would bring up the code in an external editor where we could make our changes and then publish to the Web site. This happened seamlessly on both the Windows and Macintosh platforms using Notepad/TextEdit respectively.
We noticed the other day that no such editing is possible with CS5, where instead we receive an error or no feedback at all. On Windows the XML loads for viewing but clicking on Edit Page results in the following error:
"XML validation failed. This could be because your document has an invalid structure, or Contribute doesn't support the encoding mentioned in the XML. Contact your administrator for details."
On Macintosh the page comes up blank but clicking "Edit Page" gives the same pop-up error.
I used an XML validator to make sure that my code is correct, and as noted the same code worked fine in previous versions of Contribute. Is there a setting we can change or some preference that will help resolve the problem? If further information is needed please let me know, otherwise thanks in advance for any help in finding a solution.
Thank you kindly. What we're trying to do is edit the left-hand navigation on a Web site which is written in XML. The Web site is the following:
http://www.sandiego.edu/dining/
The navigation file is accessible at the following URL, and was previously editable via Notepad/TextEdit (via Contribute) as noted in my earlier e-mail.
http://www.sandiego.edu/dining/subnav.xml
I believe Contribute uses a Safari engine, which would explain why the XML code is not visible in Contribute CS5 on Macintosh. On Windows it does come up, but then results in the XML validation error previously described. Please advise if there's anything you spot on your end, and likewise if any further clarification is needed just let me know. -
If you need help with CPS and Contribute
After over a month of contacting support, I have come to the conclusion that Adobe does not offer advanced support for CPS. The only things that the low-level tech support agents are capable of doing are
1. transferring you to someone else,
2. Sending you links to the online manual and livedocs that you have probably already read.
After spending over a month and roughly 100+ hours I have determined that CPS cannot handle multiple client certificates sent via a Department of Defense Common Access Card. I did learn the following info that you might find useful.
1. Contribute must be able to write a file to the root (when you first connect it writes TMPjaisd8ua89ua.htm then deletes it). If this fails for any reason, Contribute gives you the ever-so-helpful "could not connect" error
2. contribute must be able to write / delete files in root/_mm
3. If you want users to send drafts contribute must have access to root/MMWIP
4. If you are using Contribute CS3 or newer and you remove a connection from a computer, the connection isn't really removed 100%, you must go into regedit, current user, software, adobe, contribute41 and clear out keystore, trusted CPS Server, Admin Servers. Just leave the defaults. This might be different for version 5
5. Work with an LDAP admin to get the correct LDAP settings so CPS can view the directory
6. If using multi LDAP servers ensure that replication is working correctly on them all otherwise your users won't be able to access the site if connected to a faulty LDAP server. Also make sure users can get to the directory via windows if using local/Network connection
7. Work with the network admin to have security logging enabled on the web server so that any time Contribute gives an error they can check the log to see exactly what your account was trying to do at the time of the error.
8. Read the Manuals, in fact memorize them.
http://help.adobe.com/en_US/Contribute/4.1_CPS/help.html?content=con_overview_ov_3.html
http://help.adobe.com/en_US/Contribute/4.1_CPS/help.html?content=con_setup_install_si_18.html
http://livedocs.adobe.com/contribute/311/deploying_en/wwhelp/wwhimpl/js/html/wwhelp.htm?href=00000049.htm#134387
I hope this will shed a little light on CPS and Contribute or if not then at least give you some direction on how to debug your problems since Adobe will never help you.
Addition:
9. Create a folder on your desktop called "ctnetperformancelog" and then launch Contribute. Information logged in these files can provide invaluable insight in pinpointing where problems originate.
For example:
135 07/05/11 12:31:53 PM mmoreno netio/LAN PutData ACCESS_DENIED 5 5 0.0017 file://webdev/Dev/ /Dev/_mm/contribute.xml
This was the result of /_mm/contribute.xml inadvertently being marked read-only due to a dreamweaver misconfiguration.
See http://kb2.adobe.com/cps/195/tn_19506.html for more details.
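If the log files get long, a small script can pull out the denied operations. This is a rough Python sketch: the column layout is inferred from the single sample line above and is an assumption, not a documented format, and `find_access_denied` is a name invented for this example.

```python
def find_access_denied(lines):
    """Return (user, operation, path) for each log line that reports ACCESS_DENIED."""
    hits = []
    for line in lines:
        fields = line.split()
        if "ACCESS_DENIED" not in fields:
            continue
        status_at = fields.index("ACCESS_DENIED")
        # Assumed layout, based only on the sample line:
        #   ... user transport operation STATUS ... path
        if status_at < 3:
            continue  # too short to match the assumed layout
        user = fields[status_at - 3]
        operation = fields[status_at - 1]
        path = fields[-1]  # last whitespace-separated field
        hits.append((user, operation, path))
    return hits
```

Feeding it the lines of a log file (for example `find_access_denied(open(path))`, with `path` pointing at one of the files in the ctnetperformancelog folder) lists who was refused, during which operation, and on which file.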
Don't forget to delete or rename this folder (e.g. "ctnetperformancelog-DISABLED") after debugging since the log files can grow quite large. -
Dreamweaver & Contribute Administration Problem
I've got an odd scenario for a client site that I
inherited after several other developers were involved over the
past 2 years... I fear that the one who created the Contribute
Administration interface is the one where there are some other
legal problems involved, and there is no relationship left. If I
contacted them, they would not cooperate. Sad but true.
What I want to do is remove the previous Contribute linkage
and then reinstall a new Contribute site administration that I can
drive henceforth. I use Dreamweaver MX 2004 (still) and have no
problem at all in accessing a client site via FTP, but the second
problem is that I don't actually have total access to the hosting
server panel as usually is the case. So, if this is going to
involve deleting specific files, I'll need to know what to look for
(if it's within my view) or what to direct the hosting contractor
to find & delete.
I presently update all content on the existing site design,
but that's not very efficient and it's expensive at my rates for me
to do the day-to-day content updates that the client could easily
manage themselves via Contribute. Every time I first access the
server via Dreamweaver FTP to upload new files, I'm prompted about
enabling Contribute connectivity. I have Contribute, but don't have
this client's Contribute Administration keys or details. Those are
forever lost, I'm afraid.
I've recently redesigned the site, and it would be ideal to
enable my client to use Contribute. However, I'm caught in a loop
since there is no way to access the original Contribute site keys.
How can I find and remove the existing Contribute Administration,
then reinstall Contribute with new Contribute Administration of the
site? If I try to administer the site via Dreamweaver (Manage Site
- Contribute tab), once it checks through the cache, I get an error
that says "Could not find a page at the specified site root URL.
Please check the site root URL." I'm wondering if the previous
developer may have actually deleted something just to be nasty...
they were pretty vile and unprofessional about a lot of things.
This is far from the typical scenario I'm used to where I
always have total control over the hosting environment, too.
However, relative to an e-commerce component of their site, when
I've had issues and knew what to ask for, the site hosts have been
pretty responsive. I might be able to ask for specific files to be
removed if I could only know what to direct them to do.
Any ideas? Many thanks.
> open up a third-party FTP client and log into the remote
server.
> Look for a folder named _mm that contains a file named
contribute.xml
>
> delete that file and folder.
Alternative:
download the contribute.xml file and open it in a text
editor.
There is a way to reset the contribute Admin password if you
get this far
and want to leave the setup as-is but don't know the
password.
i can google for the answer i've given in the past, short
version of what i
remember, look in the .xml file for
password="randomcharacters" and change
it to password="" meaning password has no contents.
upload the contribute.xml file to same location on the remote
server.
then login using contribute admin, and when prompted for
password don't type
anything, just hit enter key. Then reset the password.
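The edit described above can also be scripted. This is only a sketch in Python, assuming the file really does store the value as a plain password="..." attribute as remembered above; the function name is made up, and you should keep a backup copy of the downloaded contribute.xml before uploading a changed one.

```python
import re

# Matches a whole attribute named exactly "password" (not e.g. "oldpassword").
PASSWORD_ATTR = re.compile(r'\bpassword="[^"]*"')


def blank_admin_password(xml_text):
    """Return (new_text, count): xml_text with every password="..." emptied."""
    return PASSWORD_ATTR.subn('password=""', xml_text)
```

If `count` comes back as 0, the file does not contain the attribute in that form and nothing was changed, so there is no point uploading it.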
Alan
Adobe Community Expert, dreamweaver
http://www.adobe.com/communities/experts/ -
Extend xml for orders by BBP_SAPXML1_OUT_BADI
Hi,
We need to extend the XML order export with a specific XML flag. We have some samples for extending catalogs (extended by structure TYPE BBPX1_CATALOG_TRANSMISSION), but we don't have such a structure in the order BAdI (the structure TYPE BBPX1_PURCHASE_ORDER_MESSAGE does not have any table of type bbpx1_item_str in which to fill specific flags).
Could you help us ?
Thanks in advance,
Vincent Tanguy
10.06.2010 - 10:08:01 CET - Reply by SAP
Dear customer
please see note 806127, which describes how to swap user-defined
fields using SAP-PI/XI. This note was created based on SRM4.0/5.0 but
its content is also valid for SRM6.0/7.0.
The exchange of data with the SRM-application is realized by customer
fields described in note 672960. This is the SRM70 solution instead of
old customer fields described in note 458591.
According to note 806127 there are the following steps required
1.) Use customer enhancements in the Integration Repository to enhance
the interfaces correspondingly
for details please see
http://help.sap.com/saphelp_nw70ehp1/helpdata/en/14/80243b4a66ae0ce10000000a11402f/frameset.htm
and follow this path
Design and Configuration Time
Design
Designing Interfaces and Proxy Generation
Developing Message Interfaces
Data Types
Data Type Enhancements
2.) Generate proxies in the SRM system
3.) Implement the BAdIs in the SRM system (only when the interface
enhancements do not have the same name as the customer fields)
In case the new field for your flag has same name in interface
enhancement and as customer field then the data transport is done
automatically (move-corresponding like). And you do not have to
implement the BBP_SAPXML1_OUT_BADI
kind regards
Andreas Schaeff
IMS SRM development support Germany -
Can't End Contribute with a site
I enabled Contribute on a web site and now I want to stop it.
I removed it from 'MY Connections', but dreamweaver still thinks
that it is linked to Contribute. Any ideas on the Dreamweaver side
or Contribute side?
You'll have to also delete the _mm/contribute.xml file that
is present on
the remote site. In DW8, you can use View | Hidden files,
which will allow
you to see that folder.
Murray --- ICQ 71997575
Adobe Community Expert
(If you *MUST* email me, don't LAUGH when you do so!)
==================
http://www.dreamweavermx-templates.com
- Template Triage!
http://www.projectseven.com/go
- DW FAQs, Tutorials & Resources
http://www.dwfaq.com - DW FAQs,
Tutorials & Resources
http://www.macromedia.com/support/search/
- Macromedia (MM) Technotes
==================
"John Waller" <[email protected]>
wrote in message
news:ef52et$jc0$[email protected]..
> Have you tried:
>
> Site > Manage Sites > Edit
>
> Under Category, click on Contribute and uncheck "Enable
Contribute
> compatibility"
>
> --
> Regards
>
> John Waller
> -
Cannot connect to website with adobe contribute cs5
I am using Adobe Contribute to edit my website and cannot connect, although it has been working fine for 2 years. Whenever I try to publish a page it reads "server is not accepting connections" even though I am connected to the internet.
> Can you please tell me how I can delete
> the old settings as they must be embedded somewhere in
my system
as a guess- it's the admin settings on the remote site.
Are you the only person using Contribute on that site?
If yes- using some FTP client, connect to the remote server.
Look for a folder named _mm in the root level that contains a
file named
contribute.xml
delete it [or rename it]
now try to create a new connection. {or have the admin make
new admin
settings, and then create and send you a connection key}
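For those who prefer a script to an FTP client, the rename step could look roughly like this in Python. It is a sketch using the standard ftplib module; it assumes plain FTP (ftplib does not speak SFTP), that the connection's working directory is the site root, and the backup file name is my own choice.

```python
from ftplib import FTP, error_perm


def retire_contribute_settings(ftp, backup_name="contribute.xml.bak"):
    """Rename _mm/contribute.xml on an already logged-in FTP connection.

    Returns True if the rename succeeded, False if the server refused it
    (for example because the file does not exist).
    """
    try:
        ftp.rename("_mm/contribute.xml", "_mm/" + backup_name)
        return True
    except error_perm:
        return False


# Example usage (host and credentials are placeholders):
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("user", "secret")
#       retire_contribute_settings(ftp)
```

Renaming rather than deleting keeps the old admin settings around in case you need to put them back.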
Alan
Adobe Community Expert, dreamweaver
http://www.adobe.com/communities/experts/ -
Getting Permission error in Contribute CS3
This is going into the 3rd week of being down due to lack of support from Adobe.
Our contribute server is set up to use active directory to validate the login. That works fine.
It fails when trying to validate the FTP login to the webserver. I have it set so that everyone shares the same FTP login.
This was working fine for about a year and then we started getting this error:
[Your permissions on this website have been revoked and you cannot edit pages on this website]
The known login and password work successfully in other software from that server.
We've ruled out the firewall. (packet sniffing shows no traffic from the server)
We've ruled out the antivirus software on the server. (turned it off and the problem persists)
We checked the PSAPI.DLL . (there wasn't one in the Contribute directory, copied the sys32 one there)
We deleted the contribute.xml file from the webserver _mm folder (no difference)
I CANNOT login as ADMINISTRATOR.
I would like to know what files to remove (without uninstalling) so we can start from scratch.
Thanks, Jim
We just moved our website to an SFTP server and I am having the exact same problems. This is incredibly frustrating. I am going to try the numeric URL and see if that helps. If not, I will ask everyone to install CS4. Thanks for the help!
Kelly -
Missing file for Contribute compatibility
I have a site where two of us are using Dreamweaver and
others are using Contribute. We want to all be checking files in
and out. The other person using Dreamweaver gets this message
whenever he checks a file in or out:
The file required for Contribute compatibility does not exist
on the server. Would you like to turn off Contribute
compatibility?
He's been clicking NO and everything is working fine, but
it's a real annoyance and we can't find any information about what
file that would be or how to get it on the server.
Let me reiterate: It's WORKING - files get checked out, files
get checked in, no one is stepping on anyone's toes. We're just
dealing with a really irritating error message.
Clues?
Thanks!
> The file required for Contribute compatibility does not
exist on the server.
> Would you like to turn off Contribute compatibility?
as a guess- that user has the remote site defined to a
different folder
level. So dreamweaver isn't finding the _mm/contribute.xml
file at the
defined site root.
does this hosting require public site files to be in a
specific directory
like www or public_html or httpdocs ? If yes- it may be that
they are
logged into the private ftp root instead of using the Host
Directory line in
the site def. to put them in the correct directory. -
How do you get rid of Check in/out RESIDUE?
On one site, I keep getting the messages about other users
using the site,
please enable check in and out, etc. When I first opened this
site I had
enabled Contribute and Check in and out, but I disabled it
because the
client never needed it.
Now however, I can never seem to get rid of these warnings
even though I
have cleaned up cache, disabled maintain design notes,
re-enabled design
notes, removed Contribute abilities, etc.
How do I stop these warnings???
Thanks
Jeff
~~~~~~~~~~~~
Jefferis Peterson, Pres.
Web Design and Marketing
http://www.PetersonSales.com
Hi Alan,
Okay, I went via ftp to the site and found plenty of _bak
folders and LCK
files, but no _mm folders. I deleted what I found.
The only thing checked on is "maintain synchronization info"
and I'm
wondering if that is causing the problem. And "Maintain
Design Notes" is
checked on, but uploading for sharing is not.
Do you think the residual lck files are what caused the
problem?
> to remove the Contribute enabled info- either fire up
contribute, go to
> Admin Site, and click Remove. Or- connect to site with
3rd party ftp
> client. find the _mm folder that contains contribute.xml
and delete it.
>
> optional is to delete the _bak folder. Those are
previous versions of edited
> pages.
>
> this will remove all contribute user's connection keys.
>
> then double check in the site definition and turn Off
checkin/checkout
>
> Then you could either hunt down and delete any .lck
files on the remote and
> local sites- or kill them one by one as needed when
opening them; using the
> Override Lock File button. Or right click on the site
folder in the Files
> Panel, and pick Unlock.
>
>
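To hunt down the leftover lock files on the local copy of the site in one go, something like the following Python sketch would do. It only lists files and deletes nothing; the function name is made up, and the match is case-insensitive because both "LCK" and ".lck" spellings appear in this thread.

```python
import os


def find_lock_files(site_root):
    """Return the sorted paths of every .lck file under site_root."""
    found = []
    for folder, _dirs, files in os.walk(site_root):
        for name in files:
            # Compare in lower case so both .lck and .LCK are caught.
            if name.lower().endswith(".lck"):
                found.append(os.path.join(folder, name))
    return sorted(found)
```

Review the list before removing anything, since a lock file may belong to someone who really does have the page checked out.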
~~~~~~~~~~~~
Jefferis Peterson, Pres.
Web Design and Marketing
http://www.PetersonSales.com -
Bad console export, worse documentation for session replication setup
Hi,
I'm having problem configuring config.xml/weblogic.xml to get in-memory session rep going for WL 6.0
sp1.
I already have everything up and running in 5.1 sp8; so I figure I can just export my
weblogic.properties settings to 6.0's config.xml via the console, right? Well, here's my original
properties:
weblogic.httpd.session.enable=true
weblogic.httpd.session.cookies.enable=true
weblogic.httpd.session.timeoutSecs=10000
weblogic.httpd.session.cookie.comment="Kiko session tracking cookie"
weblogic.httpd.session.cookie.domain=.kikotest.com
weblogic.httpd.session.cookie.maxAgeSecs=-1
weblogic.httpd.session.persistence=true
weblogic.httpd.session.cacheEntries=1024
weblogic.httpd.session.persistentStoreType=replicated
When I use the console export tool, I find NONE of these settings in config.xml!!! So I look
through the docs to see if I can do this manually, and the documents are just as confusing! This
mapping table:
http://e-docs.bea.com/wls/docs60///////config_xml/properties.html#1152226
seem to suggest that each session flag from weblogic.properties is mapped to two corresponding flags
in config.xml and weblogic.xml (e.g. persistentStoreType -> (config.xml) SessionPersistentStoreType
& (weblogic.xml) PersistentStoreType). Does this mean I have to set the flags for BOTH files? Or
just one? Which one???
As for config.xml flags which have "N/A" in the "Console Label" column, does this mean I just add
this flag as a top-most label, like:
<SessionPersistentStoreType>replicated</SessionPersistentStoreType>
And to further add to the confusion, when I look at the document for "Configuring In-Memory HTTP
Replication in a Cluster" http://e-docs.bea.com/wls/docs60/cluster/servlet.html#1009453, it tells me
to "set the property PersistentStoreType to replicated in the Web Application deployment descriptor,
web.xml."? I thought it was weblogic.xml???
I think all of these confusions can easily be alleviated by fixing any one of the following, if not
all three:
1) /console export of weblogic.properties to config.xml needs to translate these session flags
2) /console needs to have access to these session flags so we don't have to poke around config.xml!
3) Documents need to be less ambiguous about how to set these flags
Gene
Hi Gene,
I believe the documentation is incorrect - SessionPersistentStoreType doesn't need to be set in the
config.xml file (I don't believe this tag still exists). The file name should be weblogic.xml and has
the following general form:
<!DOCTYPE weblogic-web-app PUBLIC "-//BEA Systems, Inc.//DTD Web Application 6.0//EN"
"http://www.bea.com/servers/wls600/dtd/weblogic-web-jar.dtd">
<weblogic-web-app>
... other stuff from the dtd if you like
<session-descriptor>
<session-param>
<param-name>PersistentStoreType</param-name>
<param-value>replicated</param-value>
</session-param>
</session-descriptor>
</weblogic-web-app>
weblogic.xml goes in the WEB-INF directory in your war file.
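To double-check what a given weblogic.xml actually declares before deploying, a short Python sketch like this can read the session parameters back out. It assumes the descriptor follows the general form shown above (session-param entries directly under session-descriptor) and that any placeholder text such as "... other stuff" has been replaced with real elements; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def session_params(descriptor_xml):
    """Return {param-name: param-value} for the session-descriptor in weblogic.xml text."""
    root = ET.fromstring(descriptor_xml)
    params = {}
    for param in root.iterfind("./session-descriptor/session-param"):
        name = param.findtext("param-name")
        value = param.findtext("param-value")
        if name is not None:
            params[name.strip()] = (value or "").strip()
    return params
```

For the descriptor above, the returned dictionary should contain the key PersistentStoreType with the value replicated; if the key is missing, the setting was put somewhere the container will not look.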
Hope this helps,
Glen
-
Utl_file.file_type / ora01041:hostdef does not exist
Hello all,
I have got a problem with ora01041:hostdef does not exist.
My package compiles well until it reaches a procedure with a declaration that uses utl_file.
I get above error and Oracle terminates unexpectedly.
The strangest thing is that this only happens when executed in batch.
When I start SQL and then execute this part of the procedure everything goes well.
Help!!!
After disabling Contribute compatibility I changed the name
of the site root and renamed it so that it is not the same as what
is found on the remote server. I am still receiving the error
message, even after the name change and re-applying Contribute
compatibility. It should not ... not... be this difficult.
When I uploaded an html file from the local site (on my PC)
to the server (on a different device) I received this error
message. I checked the server and there IS a contribute.xml file
within the _mm directory within the remote site structure. The date
and time stamp on the contribute.xml file within the _mm folder
indicates that it (the contribute.xml file) was indeed updated
today (at the same time as the html file being uploaded). Therefore
it shows that the file IS being located, recognized, updated and
saved with a new date/time stamp. It is not missing. That makes me
question the validity of the response received to this inquiry.
Also, the response is proactive in nature. It indicates what
should not be done in advance of creating a site. The response does
not provide the steps necessary to resolve the error once it has
occurred.
Any other suggestions or do I just have to live with this
flaw in the software? This is really, really annoying as I have to
be sure to click No each time a file is uploaded and the error
message is received. Do I have to remove Contribute from my PC and
all files from the server that are Contribute related and reinstall
Contribute? Re-create all user profiles? I've had to do that with
another problem that I had with a different site that used
Contribute, all to no avail. In that case the only (only) way I
could get a specific end user set up and working in Contribute was
to make them an administrator (which is not acceptable though we
are living with *that* condition as well). I am quickly losing
confidence in this product! -
IDoc tunneling (Parameter XML_CONVERSION)
The parameter in the SXMB_ADM transaction defines whether the IDoc is transported as a table and is not converted to IDoc-XML in the IDoc adapter. This is only recommended if IDocs are received and sent as IDocs in the Integration Server. If none of the services in the Integration Engine use IDoc-XML, you can avoid unnecessary conversion from and to XML, thereby improving system performance.
Possible Values
'0' Every IDoc is saved as a table; no IDoc-XML conversion.
'1' Every IDoc is converted to IDoc-XML.
'2' An IDoc is converted to IDoc-XML only if requested by the service.
The question is: does anybody know what "if requested by the service" exactly means?
Regards,
Sandro
Think about a scenario SAP-XI-JMS (or flat file) and vice versa. I will do the transformation with ABAP mapping in both directions, and there's no need to receive an XML stream from the IDoc adapter or send an XML stream to the IDoc adapter.
We would get better performance if we could avoid the expensive XML transformation.
There is a check for an existing mapping for the sender message in the table SMPPREL3.
There could be a special XML flag in this table, and in the receiver determination you could set this flag or not.
XI uses CL_IDX_IDOC_RESOURCE for the IDoc tunnel. In this class EDIDC and EDIDD are serialized in an XSTRING. In my ABAP mapping I would first check Content_Type via PARAM->GET. For transforming the binary content into EDIDC and EDIDD, a static method should be implemented by SAP in CL_IDX_IDOC_RESOURCE. I think I could implement such a method myself, but it's risky if SAP changes their ITAB_TO_BINARY method in an incompatible way.
In the XI-to-SAP direction I would like to set the content_type in ABAP mapping to bin and convert the EDIDC and EDIDD to an XSTRING (like in CL_IDX_IDOC_RESOURCE->ITAB_TO_BINARY).
Regards Josef -
Spry data sets and column width
Hello,
might sound simple ... I have a dynamic spry data table. The
data show fine ... But I don't seem to be able to set the width of
my columns through CSS (I give each column a separate style class,
and all other properties work fine, except the width of the column)
Anybody a view ?
Thanks a million for your help.
Peter
<script type="text/javascript">
<!--
var Flags = new Spry.Data.XMLDataSet("../XML/flags.xml.php",
"ValidatorList/Validator_item");
//-->
</script>
<div class="Tbl" spry:region="Flags">
<table>
<tr>
<th class="ID" spry:sort="@id">Id</th>
<th class="TblC" spry:sort="Delete">Delete</th>
<th class="TblL"
spry:sort="URL_Display">Link</th>
<th class="TblL">Status</th>
<th spry:sort="Visible">Visible</th>
<th spry:sort="HTML">Html</th>
<th spry:sort="Content">Content</th>
<th spry:sort="Hits">Hits</th>
<th spry:sort="Pct">%</th>
<th spry:sort="Time">Time</th>
</tr>
<tr class="TblSummary">
<td class="ID" >IDS</td>
<td class="TblC"> </a></td>
<td class="TblL"> </a></td>
<td class="TblL"> </span></td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr spry:repeat="Flags" spry:odd="Odd" spry:even="Even"
spry:hover="Hover" spry:select="Select">
<td class="ID" >{@id}</td>
<td class="TblC"><a href="#"
onclick="{Delete}"><img src="../Images/icons/cancel.png"
/></a></td>
<td class="TblL"><a
href="{URL_Link}">{URL_Display}</a></td>
<td class="TblL"><span
class="{Status_Style}">{Status}</span></td>
<td><a href="{Visible_Pre}"><img
src="../Images/icons/{Visible}.png" /></a></td>
<td><img src="../Images/icons/{HTML}.png"
/></td>
<td><a href="{Content_Pre}"><img
src="../Images/icons/{Content}.png" /></a></td>
<td>{Hits}</td>
<td>{Pct}</td>
<td>{Time}</td>
</tr>
</table>
</div>
and the css style sheet says ...
.TblL {
text-align: left;
padding-left: 0px;
width: 150px;
} -
Pdftotext extracting from image files mystery
hello all, just had a bit of a shock when I ran pdftotext (accidentally) on an unocr'd pdf file, and it extracted all the text. Running pdftohtml (as I'd intended) produced the expected output - i.e. png dumps. Curious, I then tried running it on a bunch of other downloaded files, and with the exception of one, pdftotext extracted the text from ALL of them. The mystery then is that it isn't using any ocr (no tesseract dependency, plus it's way too quick), so clearly it must be pulling it directly from the files. But if the text really is there in the original, then presumably the people who scanned them didn't know, else they'd have left it there. The only thing I can think is that some common piece of pdf software (probably acrobat) has ocr built in, so it's scanning them automatically and then encoding the hidden text in the pdf.
take for example (legal):
http://www.cd3wd.com/cd3wd_40/JF/JF_VE/SMALL/27-714.pdf
open up in a normal pdfreader - very clearly a (poorly) scanned document. Now run 'pdftotext -layout' on it and you get a pretty impressive text file, considering the source. Sure, some of the formatting is messed up, but I'm sure anyone with sed knowledge could quickly sort most of that out. Besides, that's one of the worst documents - on most others it was perfect, maintaining columns (even tesseract can't do that) and everything.
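If you want to run this over a whole directory of downloads, a thin Python wrapper around the command is enough. This is a sketch that assumes the pdftotext binary is installed and on your PATH; the function names are my own, and only the -layout option mentioned above is used.

```python
import subprocess


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for: pdftotext [-layout] PDF-file text-file."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")  # try to preserve the physical page layout
    cmd += [pdf_path, txt_path]
    return cmd


def extract_text(pdf_path, txt_path, keep_layout=True):
    """Run pdftotext and raise CalledProcessError if it exits with an error."""
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
```

An empty or near-empty output file is a quick hint that the pdf has no hidden text layer and really would need ocr.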
It makes for a fantastic command line pdf reader that works on almost all the files I've thrown at it, and since pdf's are almost the only reason I ever have to load up X, consider me tickled pink.
Just thought I'd mention it here, as a google search brings up nothing on the issue. hope someone finds it useful
Yes, Adobe Acrobat has a feature that allows you to OCR the text and insert the text as a separate layer behind the image. Such PDFs are called "Searchable image PDFs". There are no doubt some other commercial software options that can do this too. The copier/scanner/multifunction device in my office actually does this automatically when you scan in a text PDF.
Theoretically, it is even possible to create PDFs like this using free software on linux (a combination of ExactImage and Cuneiform; or WatchOCR).
See, e.g., this blog post:
Searchable PDFs with linux
and this slashdot story:
Open Source OCR that makes Searchable PDFs
My own experiments with trying to get something like this to work have not been very successful--at least the quality of the result is nowhere near what I'm getting with our work printer.
If you like pdftohtml, you should also know about pdfreflow, which takes the XML output of pdftohtml with the -xml flag and creates an html file that can be reflowed (I.e., is smart about paragraph breaks, removes page numbers and recombines words broken through hyphenation and so on).
pdfreflow
Last edited by frabjous (2010-10-15 13:03:00)