Be careful with old versions!
These versions are displayed for reference and testing purposes. You should always use the latest version of an add-on.
Fix - Multiple selection for exporting automators was not working on Windows. This was corrected.
Enhancement - Works with Firefox 4 to 13.
Enhancement - Complete refactoring of all datasheets in the application (views and catch): they are now able to handle hundreds of thousands of rows.
Enhancement - Refactoring of the export functions to be able to handle very large volumes of data as well.
Enhancement - Fully compatible with Firefox 3.6 to 9.
Enhancement - More next page links are found, in more languages.
Enhancement - Scrapers were optimized, should be easier to produce and more forgiving.
Enhancement - When downloading files, an increment or a date was added to those with duplicate filenames. It is still the case, but when the url contains parameters, OutWit now adds these to the filename before testing for duplicates. This is more effective when the images come from databases.
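The naming rule above can be sketched in Python. This is only an illustration, not OutWit's actual code: the helper name `download_name`, the way the parameters are folded into the name, and the `-1`, `-2` increment format are all assumptions.

```python
import posixpath
import re
from urllib.parse import urlsplit


def download_name(url, existing):
    """Return a filename for `url` that does not collide with `existing`.

    Hypothetical helper: the URL parameters are added to the name first,
    and an increment is appended only if a duplicate still remains.
    """
    parts = urlsplit(url)
    base = posixpath.basename(parts.path) or "download"
    stem, ext = posixpath.splitext(base)
    if parts.query:
        # Assumption: keep only filesystem-safe characters from the query.
        safe_query = re.sub(r"[^A-Za-z0-9._-]+", "_", parts.query).strip("_")
        if safe_query:
            stem = f"{stem}_{safe_query}"
    candidate = f"{stem}{ext}"
    counter = 1
    while candidate in existing:
        candidate = f"{stem}-{counter}{ext}"
        counter += 1
    return candidate


taken = set()
for u in ("http://example.com/image.php?id=1",
          "http://example.com/image.php?id=2",
          "http://example.com/image.php?id=2"):
    name = download_name(u, taken)
    taken.add(name)
    print(name)
```

With images served by a script such as `image.php?id=...`, the first two URLs get distinct names from their parameters, and only the true repeat falls back to an increment.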
Feature - Added a #nextPage# directive in scrapers, allowing you to tell OutWit Hub how to find the link to the next page in a series when it doesn't find it automatically. For the moment, this is only applied when the scraped view is active (bottom panel not at default settings).
Feature - Many new directives were also added to scrapers to help with debugging: #showSource#, #showMatches#, #showResults#, #showDelimiter#, #showScraperErrors#, #showServerErrors#, #simulate#, #showAlert#...
Feature - Added Lookup list replacement: #lookUp(value,listOfValuesToFind,listOfReplacementValues)# or #lookUp(value;listOfValuesToFind;listOfReplacementValues)# for replacing lists of values. The elements of the first list will be respectively replaced by those of the second.
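As a rough model of the lookup replacement described above, here is a minimal Python sketch. The function name `look_up` is hypothetical, and two behaviors are assumptions the changelog does not state: matching is exact, and a value absent from the first list is returned unchanged.

```python
def look_up(value, values_to_find, replacement_values):
    """Replace `value` by the entry at the same position in the second list."""
    for found, replacement in zip(values_to_find, replacement_values):
        if value == found:  # assumption: exact, case-sensitive match
            return replacement
    # Assumption: values not present in the first list are left as they are.
    return value


codes = ["FR", "DE", "IT"]
names = ["France", "Germany", "Italy"]
print(look_up("DE", codes, names))
print(look_up("ES", codes, names))
```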
Feature - Added new replacement functions: #(term1 operator term2)# and #if(condition,valueIfTrue,valueIfFalse)# or #if(condition;valueIfTrue;valueIfFalse)#. Works with the following operators: <,=,> (comparison operators); a=A (case-insensitive comparison); a==a (case-sensitive comparison); a!=b (not equal, case insensitive); a!==b (not equal, case sensitive); a+b (addition of integers: 1+3=4; concatenation of strings: out+wit=outwit; incrementing characters: c+3=f), a-b (subtraction of integers: 5-2=3 or decrementing chars: e-3=b ), a*b (multiplication), a/b (division) and a^b (power). The terms can be literals, variables or functions.
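The three behaviors listed for + and - (integer arithmetic, string concatenation, character shifting) can be modelled as follows. This is a sketch of the documented examples only; the function names and the order in which the cases are tested (integers first, then single characters) are assumptions.

```python
def is_int(term):
    """Return True if the term (a string) reads as an integer."""
    try:
        int(term)
        return True
    except ValueError:
        return False


def plus(a, b):
    """Model of a+b: add integers, shift a character, else concatenate."""
    if is_int(a) and is_int(b):
        return str(int(a) + int(b))      # 1+3=4
    if len(a) == 1 and is_int(b):
        return chr(ord(a) + int(b))      # c+3=f
    return a + b                         # out+wit=outwit


def minus(a, b):
    """Model of a-b: subtract integers or shift a character backwards."""
    if is_int(a) and is_int(b):
        return str(int(a) - int(b))      # 5-2=3
    if len(a) == 1 and is_int(b):
        return chr(ord(a) - int(b))      # e-3=b
    raise ValueError("subtraction is only modelled for integers and characters")


print(plus("1", "3"), plus("out", "wit"), plus("c", "3"))
print(minus("5", "2"), minus("e", "3"))
```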
Feature - When using equality operators on strings (=, !=, ==, !==), you can now use the wildcard % in the second term to replace any string. (ex. these three statements are true: headstart = Head% ; homeland == h%d ; lighthouse = %HOUSE).
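One way to picture the % wildcard is as a translation to a regular expression in which % stands for any string. The Python sketch below reproduces the three examples above; the function name `wildcard_equal` and the regex-based approach are assumptions, not a description of how the Hub implements it.

```python
import re


def wildcard_equal(left, pattern, case_sensitive=False):
    """Compare `left` with `pattern`, where % in `pattern` matches any string.

    case_sensitive=False models the = operator, True models ==.
    """
    # Escape the literal pieces and join them with ".*" in place of each %.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("%"))
    flags = re.DOTALL if case_sensitive else re.DOTALL | re.IGNORECASE
    return re.fullmatch(regex, left, flags) is not None


print(wildcard_equal("headstart", "Head%"))                      # headstart = Head%
print(wildcard_equal("homeland", "h%d", case_sensitive=True))    # homeland == h%d
print(wildcard_equal("lighthouse", "%HOUSE"))                    # lighthouse = %HOUSE
```

The != and !== operators would simply be the negation of the corresponding call.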
Feature - Added new replacement variables in scrapers: #URL#, #BASEURL#, #DOMAIN#.
Feature - Added the #nextToVisit(#myURL#)# function which, in the 'Replace' field, instructs the Hub to give the variable #myURL# the next scraped value which is not found in the list of visited URLs. This means that, used in conjunction with #nextPage# and #BACK#, you can create complex scraping workflows. You can, in particular, create multi-level scraping processes.
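The selection rule behind #nextToVisit# amounts to "first scraped value that has not been visited yet". A minimal Python sketch of that rule follows; the function name `next_to_visit` and the choice of returning `None` when every value has been visited are assumptions.

```python
def next_to_visit(scraped_values, visited):
    """Return the first scraped value that is not in the visited list."""
    for value in scraped_values:
        if value not in visited:
            return value
    return None  # assumption: nothing left to visit


visited = {"http://example.com/a"}
scraped = ["http://example.com/a",
           "http://example.com/b",
           "http://example.com/c"]
print(next_to_visit(scraped, visited))
```

In a multi-level workflow, each detail page would be added to the visited list after it is scraped, so that returning to the listing page picks the following link.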
Feature - Added the new directive #variable#myVariableName#. The occurrences of the variable (#myVariableName#) are replaced, at application time, by the scraped value in all other lines of the scraper.
Feature - Added URL alteration functions: #getParam(url,parameterName)# and #setParam(url,parameterName,parameterValue)#. When used with #URL# in the #nextPage# directive line, you can easily set the value of the next page url in some cases. ex.: #setParam(#URL#,page,#(#getParam(#URL#,page)#+1)#)# in the replacement field will generate the next url, incrementing the parameter 'page'.
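For readers who want to see what the page-incrementing example computes, here is an equivalent sketch in Python using the standard `urllib.parse` module. The functions `get_param` and `set_param` are stand-ins for the scraper functions; details such as appending a parameter that is missing from the URL are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def get_param(url, name):
    """Return the value of the query parameter `name`, or None if absent."""
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if key == name:
            return value
    return None


def set_param(url, name, value):
    """Return `url` with the query parameter `name` set to `value`."""
    parts = urlsplit(url)
    new_pairs = []
    replaced = False
    for key, old in parse_qsl(parts.query, keep_blank_values=True):
        if key == name:
            if not replaced:
                new_pairs.append((key, str(value)))
                replaced = True
        else:
            new_pairs.append((key, old))
    if not replaced:
        # Assumption: a missing parameter is appended to the query string.
        new_pairs.append((name, str(value)))
    return urlunsplit(parts._replace(query=urlencode(new_pairs)))


url = "http://example.com/list?page=3&sort=asc"
next_url = set_param(url, "page", int(get_param(url, "page")) + 1)
print(next_url)
```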
Feature - The right-click menu on a scraper field in the scraper editor now allows you to highlight the matches in the source code. The same feature used on the description field, highlights matches for the whole scraper line.
Feature - Empty/Export/Download buttons were added to the datasheet bottom panels depending on the context, to export selected content.
Feature - It is now possible to completely disable images and plugins (like Flash) in the Hub, for faster browsing: use the right-click menu on 'page' in the side bar.
Feature - New settings were added to the time preferences, including temporization and pauses at set intervals for the fast scraping mode (XHR queries).
Feature - One major change is that scrapers can now reuse the same marker in several lines and use overlapping markers, which was not possible in the previous version.
Feature - Regular expressions can now be used in the find bar (ctrl-F or cmd-F) of the 'page', 'source' and 'scrapers' views! Just begin and end your patterns with "/" (i.e.: /yourRegularExpression/ ).
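The slash convention can be illustrated with a small Python sketch: a query wrapped in "/" is compiled as a pattern, anything else is searched literally. The function name `find_matches` is hypothetical, the case-insensitive literal search is an assumption, and Python's `re` syntax is used here although the regular expression flavor available in the Hub may differ.

```python
import re


def find_matches(query, text):
    """Return all matches of `query` in `text`, treating /.../ as a regex."""
    if len(query) >= 2 and query.startswith("/") and query.endswith("/"):
        pattern = re.compile(query[1:-1])
    else:
        # Assumption: a plain query is a case-insensitive literal search.
        pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [match.group(0) for match in pattern.finditer(text)]


print(find_matches(r"/\d{4}/", "From 1999 to 2012"))
print(find_matches("a.b", "a.b axb A.B"))
```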
Feature - The right-click menu of the datasheets has changed: automatic browsing and scraper application are now gathered under "Auto-Explore Selected Links". In this submenu, you will find the exploration and scraping functions 'Browse', 'Dig' and 'Fast Scrape selected URLs' (the latter with the new possibility of including the selected data fields in the scraping results), as well as the possibility to apply generic macros (i.e. macros that are not specific to a given URL) to the selected links.
Feature - You can now set the preferences so that FF runs OutWit Hub automatically on launch.
Fix - Corrected encoding problems in the Dynamic Source that could happen if the meta declaration was not UTF-8.
Fix - Corrected problem displaying some records with very large fields in the Detail panel.
Fix - Fast scraping works on very large selections of datasheet or catch rows.
Fix - Fixed the overwriting of existing files when manually saving export files.
Fix - Large number of fixes and performance optimizations throughout the code.
Fix - Several fixes in scrapers, in particular unwanted blank lines added in fast scraping mode.
Fix - The slideshow function now works even in on-demand images mode.
◦ Enhancement - Algorithm of exploration for high resolution images was optimized.
◦ Enhancement - Some minor enhancements and fixes in the scraper application algorithm.
◦ Enhancement - Compatible with Firefox 7.0 and 8.0 (Beta, Aurora). Check the history for more frequent intermediate versions.
◦ Fix - Multiple fixes and changes in scrapers: altered/optimized scraper algorithm to work with the upcoming Firefox versions (7 & 8), modified the cleaning of space characters. (Please report any unlikely negative changes in the behavior of your old scrapers.)
◦ Fix - Corrected a bug in query generation matrices with FF7+.
◦ Fix - Corrected a bug in the recognition of URLs in plain text and in the HTML export module.
◦ Fix - Corrected compatibility problem with version 4 of Firefox.
◦ Fix - Corrected problem in scrapers which could happen when the marker before contained only one character.
Enhancement - Now compatible with all versions of Firefox between 3.6 and 6 on all platforms.
Fix - Corrected error in console when dragging an item to the view list.
Fix - Fixed minor remaining problems in notification dialogs.
Fix - The 'Empty' checkbox was useless in the history view. It was removed. (A way will soon be provided to disable the history view.)
Enhancement - Bug report, suggestion and version history pages now open as separate Firefox windows so that calling them doesn't disrupt your current work in the Hub.
Enhancement - High resolution image extraction was improved.
Fix - A bug was fixed that prevented dragging elements from the page to the catch when the 'save incoming files' checkbox was checked.
Fix - The current scraper was not saved when clicking on 'Execute' directly when a cell was still being edited. This was fixed.
Enhancement - Several fine-tuning corrections were made to the scraper application. They should not result in any noticeable differences in the extracted results, but will bring performance enhancements in some cases.
Enhancement - The default extension for Excel exports is back to '.xls' which works best on all platforms. (This can be changed in the preferences).
Fix - Fixed a bug on the #stop# directive.
Fix - Solved rare problems occurring at startup on some systems.
Fix - Several minor fixes.
Enhancement - Digging through local files and folders has been dramatically enhanced.
Fix - The opening of multiple html files was fixed.
Fix - The Dig depth problem when the option was entered through the advanced settings dialog was fixed.
Enhancement - The Next Page function recognizes more series.
Enhancement - Long texts are not truncated anymore in the HTML export, only partly hidden. This means that the data is there and, although not fully visible, can be copied and pasted.
Feature - Added the #replace# directive to scrapers.
Fix - Corrected problem occurring when multiple directives were used in a scraper.
Fix - File>Open multiple files now works with .htm extension.
Fix - Corrected a Dig depth issue that appeared when set through Dig's Advanced settings dialog.
Fix - Corrected a bug in the randomize function of jobs.
Enhancement - Further enhancements in Next Page link recognition algorithm.
Enhancement - Added the possibility to set Separator & Labels in the #repeat# directive.
Enhancement - In the HTML export, the data that was previously truncated for layout purposes is now only hidden, which means that it is present in the export page source code and can be selected and copied.
Fix - Labels ending with a digit are handled properly again in scrapers.
Fix - The extra space was removed before \0 replacement in scrapers.
Feature - Added a preference to set the minimum number of rows an HTML table must have to be extracted by the 'tables' view.
Feature - Added an escape/unescape function in the right click menu of the scraper editor, which helps switching between literal content and a regular expression pattern.
Fix - Corrected cell cleaning of repeated fields in scrapers.
Fix - Several fixes in number normalization functions.
Fix - Minor fixes in advanced date parser.
Fix - In the 'tables' view, corrected a bug that made the factorization of cell labels into a column header fail when labels contained some characters (including $ or €).
Fix - The index increment function was not working properly when generating a new automator version number. This was fixed.
Fix - Corrected a bug when downloading files with strange characters, in particular, starting with a ".".
Fix - Corrected problems in the CSV export of some files containing strings of the form "#some characters:".
Enhancement - Scraped lines are now reordered when using fast scraping mode.
Enhancement - Automators can now be imported in several ways: opening several files at once, clicking on the link of an automator, etc.
Enhancement - Enhanced extended email recognition.
Fix - The download issue for files with the same name was corrected.
Fix - The 'Unresponsive script' issue in line 744 of overlay.js has been fixed.
Feature - A submenu now gives the choice of scraper to be applied to selected URLs. If no specific scraper is chosen, the scraper to apply is selected automatically, as before.
Fix - Fixed some problems in the check for updates function.
Fix - Fixed a problem under Windows XP and Vista with Firefox 4 beta.
Fix - Fixed minor problems in the URL filter for browse and dig.
Fix - The automatic inclusion of hidden fields (like "Source url") if used in a macro is now working.
Enhancement - Delete/Select duplicates now work for multiple selections.
Enhancement - Select duplicates now ignores blank cells by default. (This can be changed in the preferences.)
Enhancement - The "TAB" key is now ignored in the scraper editor.
◦ Enhancement - Compatible with Firefox 4 beta 11.
◦ Enhancement - Right-clicking on a generation matrix now allows you to apply a scraper directly to all generated URLs.
◦ Enhancement - The next page recognition algorithm was further enhanced to recognize series, reducing the risk of false positives.
◦ Enhancement - The on-demand image extraction mode now has a setting in preferences.
◦ Feature - A few instructions were added to the generation matrix format (groups and steps in ranges, #RANDOM[x:y]#).
◦ Feature - Lines can now be duplicated in the scraper editor (via the right click menu).
◦ Fix - A bug was corrected in the links view on a specific type of double encoded links.
◦ Fix - A bug was corrected that prevented new records from being created in some cases when the first line of a scraper included a Separator and a list of labels.
◦ Fix - A few bugs were corrected in generation matrices.
◦ Fix - A problem was fixed that occurred when a macro and a scraper had the same name.
◦ Fix - A regression was noted in the links view of version 18.104.22.168. Please update to this version.
◦ Fix - Advanced settings bugs were corrected in the Dig function.
◦ Fix - All deduplication functions are working again in the right-click menu on the datasheets.
◦ Fix - An interface problem was corrected in the scraper editor that left a cell highlighted after editing.
◦ Fix - Apply scraper on a generation matrix now generates the URLs, then applies the scraper.
◦ Fix - Corrected a bug in macros which prevented exported files from being saved when the overwrite option was selected.
◦ Fix - It is now possible to set tabs as separator in the preferences for csv export.
◦ Fix - RSS feeds are now found not only if links to feeds are present but on the feed pages themselves.
◦ Fix - Some locales were modified in English and French.
◦ Fix - Some regressions were fixed in next page link recognition.
◦ Fix - The Execute button of the scraper editor didn't work when the layout was set to side by side windows. This has now been fixed.
◦ Fix - The local help files were removed. All help pages are now online.
◦ Fix - The messages in the info bar are now more explicit when no scrapers have been found for the URLs.
◦ Fix - When a filter is set on the exploration of URLs (dig/browse), the filter doesn't apply to the URL of the current page any more, but only to the links within the start page.
Feature - An option was added in the preferences to include the ID column when exporting data to a file.
Feature - Scrapers: added #repeat#, #start# and #stop# directives for value repetition, hierarchical extraction in some cases and extraction start/stop.
Feature - Scrapers: added time variables in the replacement field.
Feature - A min / max setting was added to the time preferences for automatic exploration features, in order to allow random temporization in dig and browse.
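The idea of random temporization between a minimum and a maximum can be sketched as below. This is an illustration in Python, not the Hub's code: the function name `pick_delay`, the use of seconds as the unit, and the uniform distribution are assumptions.

```python
import random


def pick_delay(min_seconds, max_seconds, rng=random):
    """Pick a random pause between the min and max settings (assumed uniform)."""
    if min_seconds > max_seconds:
        raise ValueError("min_seconds must not exceed max_seconds")
    return rng.uniform(min_seconds, max_seconds)


# Example: choose a pause of 2 to 5 seconds before loading the next page.
# A real crawler loop would call time.sleep(delay) here.
delay = pick_delay(2, 5)
print(f"waiting {delay:.2f} s before the next page")
```

Varying the pause this way makes automatic exploration less regular than a fixed delay, which is the stated purpose of the min / max setting.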
Fix - A bug was fixed that appeared in some cases saving and deleting query directories.
Fix - The slideshow is working again when the automatic processing of images is disabled.
Fix - The "Save incoming files" checkbox of the catch is working again.
Fix - The OutWit programs are working again on Iceweasel (Linux) - See FAQ.
Fix - Corrected a bug in scrapers happening with field names ending with a digit.
Fix - Corrected a problem in macros altering the settings in some rare cases.
Enhancement - Macros: changes were made in the dig conditions, for consistency in the MAU. The syntax is now the same as in the "select if" criteria.
Enhancement - The kernel is now compatible with Firefox 4 b8.
Enhancement - Small changes and enhancements were made in macros.
Enhancement - Some significant changes and enhancements were made in scrapers.
Enhancement - A checkbox was added to the catch section of the macro editor, to empty the catch without saving it.
Enhancement - Most of the useless multiple log messages have been removed.
Enhancement - Query generation matrices: added a few functions to the time variables.
Enhancement - Added the possibility to select the whole content of the address bar when pressing the esc key.
* Enhancement - Remaining graphic interface glitches solved for Firefox 4 beta.
- Feature - The query sorting (click on column header) now has 3 states: ascending, descending and user defined.
- Fix - The 'sort by' and 'limit' functions are now working in macros.
- Fix - The 'send to catch function' was not working in some cases from the images view. This was fixed.
- Fix - Sending data to catch with empty columns was removing last columns in certain cases.
- Fix - The page and images throbbers were spinning endlessly in some pages on Firefox 4.
- Enhancement - Removed the alert on redirection errors.
- Enhancement - When a process is finished, the time of the end of execution is displayed in the info bar.
- Enhancement - The program recognizes more next page links.
- Enhancement - 'Select similar' and 'Select identical' functions were dramatically optimized.
- Fix - Several minor bugs were corrected in the scraper application. Please check that your previous scrapers are still behaving the same way.
- Fix - Firefox restart problems were corrected in the kernel compatibility check and update process.
- Fix - In rare cases, the numbering of columns was incorrect in tables. The indexes now correspond to the column number, if no header is set for the column.
- Enhancement - Preferences are applied when the preferences window is closed. The OK and Cancel buttons now work in Windows.
- Enhancement - The 'Next Page' function was enhanced and now recognizes a larger number of links.
- Fix - The images view throbber was spinning forever when loading a file without images, a non-HTML document or a local file. This was fixed.
- Fix - A bug was corrected in the 'Invert Selection' function.
- Enhancement - The Chinese locale was removed for now. It was too incomplete to be usable. (Spanish and German locales were kept for now although they are not fully completed. In version 1.0, only fully translated locales will be available.)
- Enhancement - More separators are recognized in the list of labels field in the Scraper Editor. Labels can be separated by: ,;:/|# or TAB. (The TAB separation allows you to simply copy and paste headers from a table of the page.)
- Enhancement - The update and compatibility check processes were enhanced.
- Enhancement - More minor fixes and enhancements were added to this version.
- Feature - Now works with the latest beta of Firefox 4.
- Feature - A first implementation of export to HTML was added. More layout flexibility will be added later.
- Feature - The Words view now finds frequent groups of words. (This feature is being tested and will probably not remain as is in the following versions.)
- Feature - A preference was added to choose whether the program should load or edit URLs on double click in datasheets.
- Feature - The content of the Text view can now be moved to catch or exported.
- Fix - Send to catch on page load bug was fixed in data views.
- Fix - A bug was corrected that could offset cells when moved to the catch from tables.
- Fix - Some complex pages could generate Unresponsive Script alerts. This should not happen anymore.
- Fix - Fixed a bug opening catch files with spaces in the name.
- Fix - A data type was added in the SQL export, for broader compatibility.
- Enhancement - Queries were sometimes sent too fast in 'fast scraping' mode for some sites. The frequency was reduced. (A preference will soon allow direct control over this parameter.)
- Enhancement - The management and execution of automators was enhanced in several parts of the code.
- Enhancement - The editing, moving and duplicating of query directory content was enhanced.
- Enhancement - Many other small fixes and enhancements were made.
- Feature - When opening an XML file, the program now recognizes if it is an OutWit automator and imports it directly to the user database.
- Feature - The Select If function was not working on 'equal to' and 'not equal to' with strings. This was corrected.
- Feature - It is now possible to split a scraped field into several fields with the new separator features of the scrapers (pro version).
- Fix - Send to catch on page load bug was fixed in several views.
- Known Issue - The 'send to catch on page load' function is still not working in data views. (Corrected in the coming update.)
- Fix - Export was not working for some extractors when used in macros. This was corrected.
- Fix - Insert rows in an empty datasheet is now working properly.
- Fix - Downloaded files with hexa encoded characters in the filename are now saved with the proper name on the disk.
- Fix - Dig and browse were not working properly on some pages, creating duplicated catch data, and not downloading files. This is fixed. Please use the bug report form if you find problematic web sites.
- Enhancement - Improved loading time of some web pages.
- Enhancement - Next and Up buttons now show their URL in the status bar.
- Enhancement - New enhancements were made to the scraper application function. They might alter the way old scrapers will behave. Please test thoroughly.
- Enhancement - When exporting a selection of rows to Excel, if labels are identical in all rows, the header will be set to that label in the exported file.
- Enhancement - In queries the program now recognizes if a query is a matrix when dragged and after it has been edited.
- Enhancement - Registration system was updated.
- Enhancement - Images extraction and dynamic node processing was enhanced for large AJAX pages.
- Enhancement - Online help is now... online.
- Fix - A bug was corrected that prevented exports from macros on Windows.
- Fix - A bug was corrected that made the Catch export from a macro generate a new column for each value.
- Fix - A few glitches were addressed in the saving of jobs when changes are made in the manager or in a macro.
- Fix - A refresh problem was solved in the last/next execution field of the job editor.
- Fix - Deleting the last lines of a table now removes its values from the detail panel or the queries editor.
- Enhancement - A finer access to timeout setting was added in the preferences panel.
- Enhancement - The name of the Jobs view is set to bold when a job is scheduled and active.
Feature - Pro version features were added to this version for beta test purposes. These features include new views (queries, macros, jobs, documents), as well as new features in the previous views (replace in scrapers, adjacent, limit...). Please check the inline help for more info on these.
Feature - A serial number must be entered to access these features. Pre-registration is open. We will send a temporary key to users who wish to beta test the version, as soon as the test program starts.
Feature - An info bar was added on top of the main panel with info on processes in progress and on the content of the current view.
Feature - The info bar can be moved up and down to reveal or hide a log of the program actions. The number of messages is limited at this point, but additional info will come in future versions.
Fix - A whole list of fixes and enhancements were made in this version.
* Enhancement - Some interface enhancements were made in the trees, including in the resizing of columns.
* Enhancement - Escape now first exits from full screen if needed and returns to the page view, then, pressing Escape a second time stops any running process.
* Enhancement - Minor enhancements in the update system. In particular, when several outfits are updated, the program doesn't open one tab per outfit any more.
* Fix - In some cases, the Stop button was not activated in dig or browse processes. This was solved.
* Fix - A bug made the email extractor return, in some cases, results containing %20. This was corrected.
* Known Issue - Since version 0.8.9.192, scraper application can take a little longer in some cases. On very large pages, it can even generate a "Script not Responding" alert. We are working on it.
Enhancement - Search of links is now also performed in XML files.
Enhancement - Guess is now also performed on non-HTML files.
Fix - Minor corrections in update system.
Feature - It is now possible to select the number of pages to browse automatically in the Browse button popup menu.
Feature - It is now possible to constrain the URLs to dig, using the Dig button popup menu (within domain, outside domain or all links).
Feature - Added the much awaited 'Text' column in the links view. (Yes we should have done it a while ago).
Feature - Added a bottom panel to the Text view, in order to move text content automatically to the catch.
Enhancement - The online help is finally usable. All program views are covered (including the views of the Pro version that are not yet available in this beta). Menu items will follow.
Enhancement - Optimized destination folder management: Now defaults to an 'outwit' subfolder in the folder selected in Firefox (usually, the system's 'download' folder). Can be altered in 'preferences' or in 'export' and 'download' functions.
Enhancement - Optimized interface elements and help content which reduced the overall size of the package.
Enhancement - Slightly optimized the link search module, which did some redundant checks in the last versions.
Fix - Corrected a regression with the 'Adjacent' option in images, which prevented the generation of thumbnails in subsequent pages.
Fix - The path to the save and download folder is now stored in UTF-8 which makes it usable in non-latin languages.
Fix - Corrected a regression on the automatic sorting of images.
Fix - Corrected a fatal error when loading the Hub with a corrupted catch file. The program should now repair the catch file automatically if it happens.
Enhancement - Operations are faster on large catch files.
Enhancement - On some Ajax pages, the content was not re-analyzed when dynamically changed after a click. This was addressed.
Fix - The Dig button is operational again.
Feature - Through the 'Insert line(s)' feature, it is now possible to insert generated sequences of strings in most views (links, images, emails, tables, lists, guess and scraper). This can be used to automatically create incremented strings, URLs to explore, sequential file names to download, etc. In the current version, this feature is limited to 15 items.
Feature - A new checkbox in the bottom panel of the 'Images' view lets you ask the program to look for possible neighboring images in series on the same server. In the current version, the program only searches for directly adjacent images in sequences.
Enhancement - Some images were filtered out if the aspect ratio of the high resolution image differed from that of the thumbnail. The filter was meant to skip images whose full resolution version was unavailable, but it also kept the program from finding the high res images in many galleries. This filter was removed.
Enhancement - When opening a Catch file, the program now asks if the user wants to add to the existing items of the catch or replace them.
Fix - Bookmarks and History are now accessible through the Navigation menu on Macintosh.
Fix - Several minor bugs and locale issues were fixed.
Feature - Added access to the history of visited pages in the Navigation menu.
Feature - Added possibility to remember the state of the Save Incoming Files mode between sessions.
Feature - Now keeps path of the destination folder in preferences.
Fix - Corrected a bug that sometimes occurred when importing a scraper.
Enhancement - Some Kernel code refactoring and commenting.
Enhancement - Finds more high-resolution images in galleries.
Enhancement - Finds images in html files saved on the hard disk.
Enhancement - Finds images in simple text files.
Fix - Some more bugs were fixed in the scraper application.
Enhancement - Now compatible with Firefox 3.5.
Corrected a bug that prevented scrapers from being applied in some cases where the character encoding declaration was incorrect.
Added advanced selection features in the bottom panel of each view. The extracted data can now be filtered using criteria such as contains, begins with, greater than...
Refactoring of the code and addition of image management features, for the coming release of OutWit Images.