digitalmars.D.learn - HTML Parsing lib

reply "Suliman" <evermind live.ru> writes:
I found only https://github.com/Bystroushaak/DHTMLParser

But I can't get it to work:
C:\Users\Dima\Downloads\DHTMLParser-master\DHTMLParser-master>dmd find_links.d
OPTLINK (R) for Win32  Release 8.00.15
Copyright (C) Digital Mars 1989-2013  All rights reserved.
http://www.digitalmars.com/ctg/optlink.html
find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser11parseStringFAyaZC11dhtmlparser11HTMLElement
find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser12__ModuleInfoZ
--- errorlevel 2

Is there any other HTML parsing lib, or does someone know how to 
get this one to work? It looks like it's not compatible with the 
current version of DMD.
Oct 25 2014
next sibling parent reply "MrSmith" <mrsmith33 yandex.ru> writes:
On Saturday, 25 October 2014 at 19:44:25 UTC, Suliman wrote:
 I found only https://github.com/Bystroushaak/DHTMLParser

 But I can't get it to work:
 C:\Users\Dima\Downloads\DHTMLParser-master\DHTMLParser-master>dmd find_links.d
 OPTLINK (R) for Win32  Release 8.00.15
 Copyright (C) Digital Mars 1989-2013  All rights reserved.
 http://www.digitalmars.com/ctg/optlink.html
 find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser11parseStringFAyaZC11dhtmlparser11HTMLElement
 find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser12__ModuleInfoZ
 --- errorlevel 2

 Is there any other HTML parsing lib, or does someone know how to 
 get this one to work? It looks like it's not compatible with the 
 current version of DMD.
You need to pass the library to the compiler as well: either all of its source files, or the .lib/.a file if it is compiled as a static library.
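
To see which module the linker is actually missing, the mangled names from the error output can be decoded. A minimal sketch using druntime's core.demangle; describeSymbol is just a helper name made up for this example. The first symbol encodes a function parseString in module dhtmlparser that takes a string and returns an HTMLElement, and the second is that module's ModuleInfo, so both point at dhtmlparser.d not having been compiled or linked in.

```d
import core.demangle : demangle;
import std.stdio : writeln;

/// Returns a human-readable form of a mangled D symbol name.
/// demangle() hands back its input unchanged when it cannot decode it.
/// (describeSymbol is a hypothetical helper, not part of any library.)
string describeSymbol(string mangled)
{
    return demangle(mangled).idup;
}

void main()
{
    // The two symbols OPTLINK reported as undefined.
    writeln(describeSymbol("_D11dhtmlparser11parseStringFAyaZC11dhtmlparser11HTMLElement"));
    writeln(describeSymbol("_D11dhtmlparser12__ModuleInfoZ"));
}
```

Both names start with the module name dhtmlparser, which is the hint that the file defining that module has to be on the dmd command line.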
Oct 25 2014
parent reply "MrSmith" <mrsmith33 yandex.ru> writes:
On Saturday, 25 October 2014 at 19:46:01 UTC, MrSmith wrote:
 You need to pass the library to the compiler as well: either all of its 
 source files, or the .lib/.a file if it is compiled as a static library.
You can try: dmd find_links.d dhtmlparser.d quote_escaper.d
Oct 25 2014
parent reply "Suliman" <evermind live.ru> writes:
 You need to pass the library to the compiler as well: either all of its 
 source files, or the .lib/.a file if it is compiled as a static library.
 You can try: dmd find_links.d dhtmlparser.d quote_escaper.d
C:\Users\Dima\Downloads\DHTMLParser-master\DHTMLParser-master>dmd find_links.d quote_escaper.d
OPTLINK (R) for Win32  Release 8.00.15
Copyright (C) Digital Mars 1989-2013  All rights reserved.
http://www.digitalmars.com/ctg/optlink.html
find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser11parseStringFAyaZC11dhtmlparser11HTMLElement
find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser12__ModuleInfoZ
--- errorlevel 2
Oct 25 2014
parent reply "Suliman" <evermind live.ru> writes:
On Saturday, 25 October 2014 at 19:51:48 UTC, Suliman wrote:
 You need to pass the library to the compiler as well: either all of its 
 source files, or the .lib/.a file if it is compiled as a static library.
 You can try: dmd find_links.d dhtmlparser.d quote_escaper.d
 C:\Users\Dima\Downloads\DHTMLParser-master\DHTMLParser-master>dmd find_links.d quote_escaper.d
 OPTLINK (R) for Win32  Release 8.00.15
 Copyright (C) Digital Mars 1989-2013  All rights reserved.
 http://www.digitalmars.com/ctg/optlink.html
 find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser11parseStringFAyaZC11dhtmlparser11HTMLElement
 find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser12__ModuleInfoZ
 --- errorlevel 2
Sorry, I missed dhtmlparser.d
Oct 25 2014
parent reply "Suliman" <evermind live.ru> writes:
How can I build such an app with DUB?
Oct 25 2014
parent reply "MrSmith" <mrsmith33 yandex.ru> writes:
On Saturday, 25 October 2014 at 19:55:10 UTC, Suliman wrote:
 How can I build such an app with DUB?
Unfortunately that library has no dub package. But you can include it in your project. See info here http://code.dlang.org/package-format
Oct 25 2014
parent reply "Suliman" <evermind live.ru> writes:
 Unfortunately that library has no dub package.
 But you can include it in your project.

 See info here http://code.dlang.org/package-format
I can't understand how to tell dub that it needs to include other files in the compilation... Could you help me?
Oct 25 2014
parent "Chris" <wendlec tcd.ie> writes:
On Sunday, 26 October 2014 at 06:20:45 UTC, Suliman wrote:
 Unfortunately that library has no dub package.
 But you can include it in your project.

 See info here http://code.dlang.org/package-format
 I can't understand how to tell dub that it needs to include other 
 files in the compilation... Could you help me?
You can set up different configurations. Here's a template / example (the [=> ...] notes are annotations, not part of the file):

{
  "name": "myproject",
  "description": "Something useful, hopefully.",
  "copyright": "Copyright © 2014, Me, myself and I",
  "authors": ["Me"],
  "homepage": "http://www.me.com/",
  "dependencies": {
  },
  "configurations": [
    {
      "name": "config_1",
      "targetType": "executable",
      "platforms": ["linux"],
      "lflags": [
        "-Llib/linux/64bit"    [=> Here you tell the linker where to find the libraries]
      ],
      "libs": [
        "somelib",             [=> Here you tell it which libraries to use]
        "morelibs"
      ],
      "excludedSourceFiles": ["source/dll/dllmain.d", "source/dll/mylib.d"]
    },
    {
      "name": "config_2",
      "targetName": "mylib.dll",
      "targetType": "dynamicLibrary",
      "targetPath": "bin/windows/32bit/dll",
      "platforms": ["windows"],
      "lflags": [
        "-Llib/windows/32bit"
      ],
      "libs": [
        "somelib"
      ],
      "mainSourceFile": "source/dll/dllmain.d",
      "sourceFiles-windows-x86-dmd": ["source/dll/blah.d"],
      "excludedSourceFiles": ["source/app.d"]
    }
  ]
}
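
For the DHTMLParser case from this thread, a much smaller dub.json may be enough. This is only a sketch under assumptions: the project name is made up, find_links.d is assumed to have been copied to source/app.d, and the library's dhtmlparser.d and quote_escaper.d are assumed to sit in a hypothetical libs/dhtmlparser/ folder. The "sourceFiles" setting lists extra files to hand to the compiler, and "importPaths" tells it where to look for imported modules.

```json
{
    "name": "find-links",
    "targetType": "executable",
    "sourceFiles": [
        "libs/dhtmlparser/dhtmlparser.d",
        "libs/dhtmlparser/quote_escaper.d"
    ],
    "importPaths": ["source", "libs/dhtmlparser"]
}
```

Copying the two library files straight into source/ should avoid the extra settings altogether, since dub compiles everything under source/ by default.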
Oct 28 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
Another option for html is my dom.d

https://github.com/adamdruppe/arsd

get dom.d and characterencodings.d in your project directory.

compile with dmd yourfile.d dom.d characterencodings.d

here's an example:

import arsd.dom;

void main() {
    auto document = new Document();

    // The example document will be defined inline here
    // We could also load the string from a file with
    // std.file.readText or the web with std.net.curl.get
    document.parseGarbage(`<html><head>
      <meta name="author" content="Adam D. Ruppe">
      <title>Test Document</title>
    </head>
    <body>
      <p>This is the first paragraph of our <a href="test.html">test document</a>.
      <p>This second paragraph also has a <a href="test2.html">link</a>.
      <p id="custom-paragraph">Old text</p>
    </body>
    </html>`);

    import std.stdio;
    // retrieve and print some meta information
    writeln(document.title);
    writeln(document.getMeta("author"));
    // show a paragraph’s text
    writeln(document.requireSelector("p").innerText);
    // modify all links
    document["a[href]"].setValue("source", "your-site");
    // change some html
    document.requireElementById("custom-paragraph").innerHTML = "New <b>HTML</b>!";
    // show the new document
    writeln(document.toString());
}




You can replace the html string with something like
std.file.readText("yourfile.html"); too
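
A minimal sketch of that file-based variant, assuming dom.d and characterencodings.d are compiled alongside it as described above. It prints the target of every link, which is roughly what the find_links example was after. "yourfile.html" is a placeholder path, and querySelectorAll and getAttribute are used on the assumption that your copy of dom.d provides them.

```d
import arsd.dom;
import std.file : readText;
import std.stdio : writeln;

void main()
{
    // "yourfile.html" is a placeholder; point it at any HTML file.
    auto document = new Document();
    document.parseGarbage(readText("yourfile.html"));

    // Print the href of every <a> element that has one.
    // Assumes querySelectorAll/getAttribute exist in your dom.d version.
    foreach (link; document.querySelectorAll("a[href]"))
        writeln(link.getAttribute("href"));
}
```

Compile it the same way: dmd yourfile.d dom.d characterencodings.d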


My library is meant to give an api similar to javascript.


I don't use dub so idk about how to use that, I just recommend
adding my files to your project if you wanna try it.
Oct 25 2014
prev sibling parent "yazd" <yazan.dabain gmail.com> writes:
On Saturday, 25 October 2014 at 19:44:25 UTC, Suliman wrote:
 I found only https://github.com/Bystroushaak/DHTMLParser

 But I can't get it to work:
 C:\Users\Dima\Downloads\DHTMLParser-master\DHTMLParser-master>dmd find_links.d
 OPTLINK (R) for Win32  Release 8.00.15
 Copyright (C) Digital Mars 1989-2013  All rights reserved.
 http://www.digitalmars.com/ctg/optlink.html
 find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser11parseStringFAyaZC11dhtmlparser11HTMLElement
 find_links.obj(find_links)
  Error 42: Symbol Undefined _D11dhtmlparser12__ModuleInfoZ
 --- errorlevel 2

 Is there any other HTML parsing lib, or does someone know how to 
 get this one to work? It looks like it's not compatible with the 
 current version of DMD.
You can try https://github.com/bakkdoor/gumbo-d
Oct 26 2014