Should Nokogiri replace REXML?

This is probably more suited to the ruby-core mailing list, but as I
am not following that list regularly any longer I'll bring it up here.

REXML is starting to look pretty dated. I find myself no longer using
REXML for anything and use Nokogiri instead. I suspect other
developers are doing the same. Aaron (and Mike) have done an
incredible job with Nokogiri. And so I think it's not unreasonable to
suggest that it replace REXML as a standard library.

The only downside I see is that Nokogiri is not a pure Ruby library, but
depends on libxml2. But given the advantages, speed, and uptake of
Nokogiri, I would not expect that to be any sort of show-stopper.

I have always thought a good XML library was important to Ruby. I kept
libxml-ruby on life support for many years hoping someone would
eventually come along and carry on development (it was the best I
could do, not being a C coder). That did happen eventually and we can
thank Dan and Charlie for all their hard work for making libxml-ruby
an excellent library, and of course we should thank Sean who started
the project.

But Aaron came along and upped the ante with Nokogiri.

So anyway, I've had this thought in the back of my mind for a while,
and just wanted to put it out there.

Reply Intransition 1/21/2010 4:32:09 PM


Wouldn't that make it really hard for the non C-based Ruby implementations?

Reply Jordi 1/21/2010 7:17:01 PM

On Jan 21, 4:17 pm, Jordi Bunster <jo...@bunster.org> wrote:
> Wouldn't that make it really hard for the non C-based Ruby implementations?

Well, Nokogiri was ported to JRuby using FFI, which means it can
possibly work on MacRuby and MagLev too.

http://github.com/tenderlove/nokogiri/tree/java

Dunno its status, but it seems pretty doable.

--
Luis Lavena
Reply Luis 1/21/2010 8:29:38 PM

On Thu, Jan 21, 2010 at 1:17 PM, Jordi Bunster <jordi@bunster.org> wrote:
> Wouldn't that make it really hard for the non C-based Ruby implementations?

Yes, it would. But if someone wants to help implement the remaining
bits of the pure-Java Nokogiri, we'll be pretty close in JRuby.

http://www.serabe.com/2009/12/31/helping-nokogiri-take-ii/

Unfortunately libxml encompasses a *lot* of functionality not
typically included in the many Java XML parsers (like bad HTML
scrubbing), so including all of Nokogiri would introduce a lot of
dependencies. Ideally I'd like to see a Nokogiri "lite" that just
provides the W3C APIs for DOM, SAX, and pull parsing, and allows you
to pull in "Nokogiri HTML" or some other library for doing HTML
scrubbing.

- Charlie
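
For readers unfamiliar with the three API styles named above (DOM, SAX, and pull parsing), here is a minimal sketch of how they differ. It uses REXML only because it needs no native library; it is not the "lite" Nokogiri API proposed in the post, and it assumes the `rexml` library is installed (it ships with Ruby, as a bundled gem from Ruby 3.0 on). The sample document and the `EntryIdListener` class are hypothetical names invented for this illustration.

```ruby
require 'rexml/document'
require 'rexml/parsers/pullparser'
require 'rexml/streamlistener'

# Hypothetical sample document, used only for illustration.
SAMPLE_XML = '<feed><entry id="1">first</entry><entry id="2">second</entry></feed>'

# DOM: build the whole tree in memory, then query it.
doc = REXML::Document.new(SAMPLE_XML)
dom_ids = REXML::XPath.match(doc, '//entry').map { |e| e.attributes['id'] }

# SAX-style: the parser calls back into a listener as it reads the input.
class EntryIdListener
  include REXML::StreamListener
  attr_reader :ids

  def initialize
    @ids = []
  end

  # Called once per opening tag; attrs is a Hash of attribute names to values.
  def tag_start(name, attrs)
    @ids << attrs['id'] if name == 'entry'
  end
end

listener = EntryIdListener.new
REXML::Document.parse_stream(SAMPLE_XML, listener)
sax_ids = listener.ids

# Pull: the caller asks for the next event whenever it is ready for one.
parser = REXML::Parsers::PullParser.new(SAMPLE_XML)
pull_ids = []
while parser.has_next?
  event = parser.pull
  # For a start-element event, event[0] is the tag name and event[1] the attributes.
  pull_ids << event[1]['id'] if event.start_element? && event[0] == 'entry'
end

p dom_ids, sax_ids, pull_ids
```

All three passes collect the same `id` attributes. The DOM version holds the full tree in memory, while the SAX-style and pull versions process the input as a stream of events.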

Reply Charles 1/21/2010 8:35:11 PM

On Thu, Jan 21, 2010 at 2:31 PM, Luis Lavena <luislavena@gmail.com> wrote:
> On Jan 21, 4:17 pm, Jordi Bunster <jo...@bunster.org> wrote:
>> Wouldn't that make it really hard for the non C-based Ruby implementations?
>
> Well, Nokogiri was ported to JRuby using FFI, which means it can
> possibly work on MacRuby and MagLev too.
>
> http://github.com/tenderlove/nokogiri/tree/java
>
> Dunno its status, but it seems pretty doable.

FFI is great, but it's only usable where you have libxml available (a
problem for the C ext as well) and where you are allowed to use it (a
big problem for JRuby users on Google App Engine, Android, or secure
environments where native libraries are forbidden).

The only perfect solution for JRuby is a pure-Java Nokogiri.

- Charlie

Reply Charles 1/21/2010 8:43:35 PM
comp.lang.ruby
