module Kramdown::Parser::Html::Parser
Contains the parsing methods. This module can be mixed into any parser to get HTML parsing functionality. The only things the including class must provide are the instance variables @stack, for storing the needed state, and @src (an instance of StringScanner), for the actual parsing.
Public Instance Methods
Process the HTML start tag that has already been scanned/checked via @src.
Does the common processing steps and then yields to the caller for further processing. The first parameter is the created element; the second parameter is true if the HTML element is already closed, i.e. contains no body; the third parameter specifies whether the body, and the end tag, need to be handled in case closed=false.
    # File lib/kramdown/parser/html.rb, line 86
    def handle_html_start_tag(line = nil) # :yields: el, closed, handle_body
      name = @src[1]
      name.downcase! if HTML_ELEMENT[name.downcase]
      closed = !@src[4].nil?
      attrs = parse_html_attributes(@src[2], line, HTML_ELEMENT[name])

      el = Element.new(:html_element, name, attrs, category: :block)
      el.options[:location] = line if line
      @tree.children << el

      if !closed && HTML_ELEMENTS_WITHOUT_BODY.include?(el.value)
        closed = true
      end
      if name == 'script' || name == 'style'
        handle_raw_html_tag(name)
        yield(el, false, false)
      else
        yield(el, closed, true)
      end
    end
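The three block parameters are easiest to read from the caller's side. The following is a minimal sketch, not kramdown code: `start_tag`, `void_tag?`, and `describe` are hypothetical stand-ins, the element is represented by its tag name only, and the list of body-less tags is an assumed subset of HTML_ELEMENTS_WITHOUT_BODY.

```ruby
# Hypothetical predicate; assumed subset of HTML_ELEMENTS_WITHOUT_BODY.
def void_tag?(name)
  %w[br hr img input].include?(name)
end

# Hypothetical stand-in that follows the same yield contract as
# handle_html_start_tag: it yields (element, closed, handle_body).
def start_tag(name, self_closing)
  closed = self_closing || void_tag?(name)
  if %w[script style].include?(name)
    # Raw-content elements: the body was already consumed as raw text,
    # so the caller must not try to parse it again.
    yield(name, false, false)
  else
    yield(name, closed, true)
  end
end

# A caller deciding what to do based on the three parameters.
def describe(name, self_closing = false)
  start_tag(name, self_closing) do |el, closed, handle_body|
    if !handle_body
      "#{el}: body already consumed as raw text"
    elsif closed
      "#{el}: no body to parse"
    else
      "#{el}: parse body until </#{el}>"
    end
  end
end

puts describe('div')
puts describe('br')
puts describe('script')
```

In this sketch a caller only descends into the body when `handle_body` is true and `closed` is false, which mirrors the distinction the parameter description draws.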
Handle the raw HTML tag at the current position.
    # File lib/kramdown/parser/html.rb, line 126
    def handle_raw_html_tag(name)
      curpos = @src.pos
      if @src.scan_until(/(?=<\/#{name}\s*>)/mi)
        add_text(extract_string(curpos...@src.pos, @src), @tree.children.last, :raw)
        @src.scan(HTML_TAG_CLOSE_RE)
      else
        add_text(@src.rest, @tree.children.last, :raw)
        @src.terminate
        warning("Found no end tag for '#{name}' - auto-closing it")
      end
    end
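The key step here is `StringScanner#scan_until` with a lookahead pattern, which stops the scanner just before the end tag instead of after it. A minimal sketch using only the standard library follows; `raw_body` is a hypothetical helper, and its closing-tag pattern is a simplified stand-in for HTML_TAG_CLOSE_RE.

```ruby
require 'strscan'

# Hypothetical helper: returns [raw_text, found_end_tag] for a raw element
# such as <script>, starting at the scanner's current position.
def raw_body(src, name)
  # The lookahead (?=...) makes scan_until stop *before* the end tag,
  # so the tag itself is not part of the returned raw text.
  if (text = src.scan_until(/(?=<\/#{name}\s*>)/mi))
    src.scan(/<\/#{name}\s*>/i) # consume the end tag (simplified pattern)
    [text, true]
  else
    # No end tag: everything that is left counts as raw text.
    text = src.rest
    src.terminate
    [text, false]
  end
end

src = StringScanner.new("if (a < b) { run(); }</SCRIPT><p>after</p>")
text, closed = raw_body(src, 'script')
puts text
puts closed
puts src.rest
```

Because the match is case-insensitive, `</SCRIPT>` closes a `script` element, and the `<` inside the script body is never treated as markup.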
Parses the given string for HTML attributes and returns the resulting hash.
If the optional line parameter is supplied, it is used in warning messages.
If the optional in_html_tag parameter is set to false, attribute names are not converted to lowercase.
    # File lib/kramdown/parser/html.rb, line 113
    def parse_html_attributes(str, line = nil, in_html_tag = true)
      attrs = {}
      str.scan(HTML_ATTRIBUTE_RE).each do |attr, val, _sep, quoted_val|
        attr.downcase! if in_html_tag
        if attrs.key?(attr)
          warning("Duplicate HTML attribute '#{attr}' on line #{line || '?'} - overwriting previous one")
        end
        attrs[attr] = val || quoted_val || ""
      end
      attrs
    end
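To illustrate the lowercase and duplicate handling, here is a minimal standalone sketch. It assumes a simplified attribute pattern (`attribute_re`, a hypothetical stand-in for HTML_ATTRIBUTE_RE, which is not shown on this page), and the hypothetical `parse_attrs` collects warnings in an array instead of calling `warning`.

```ruby
# Simplified stand-in for HTML_ATTRIBUTE_RE (an assumption): a name,
# optionally followed by an unquoted or a quoted value.
def attribute_re
  /\s*([\w:-]+)(?:\s*=\s*(?:(\w+)|("|')(.*?)\3))?/m
end

# Hypothetical helper mirroring the documented behaviour:
# returns [attributes_hash, warnings].
def parse_attrs(str, in_html_tag = true)
  attrs = {}
  warnings = []
  str.scan(attribute_re).each do |attr, val, _sep, quoted_val|
    attr = attr.downcase if in_html_tag
    # A repeated name overwrites the earlier value and is reported.
    warnings << "Duplicate HTML attribute '#{attr}'" if attrs.key?(attr)
    # Attributes without a value are stored as the empty string.
    attrs[attr] = val || quoted_val || ""
  end
  [attrs, warnings]
end

attrs, warnings = parse_attrs(%q{ ID="a" class=box id='b' disabled})
p attrs
p warnings
```

With `in_html_tag` left at true, `ID` and `id` collapse into one key and the later value wins; passing false keeps the names exactly as written.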
Parse raw HTML from the current source position, storing the found elements in el. Parsing continues until one of the following criteria is fulfilled:
- The end of the document is reached.
- The matching end tag for the element el is found (only used if el is an HTML element).
When an HTML start tag is found, processing is deferred to handle_html_start_tag, providing the block given to this method.
    # File lib/kramdown/parser/html.rb, line 149
    def parse_raw_html(el, &block)
      @stack.push(@tree)
      @tree = el

      done = false
      while !@src.eos? && !done
        if (result = @src.scan_until(HTML_RAW_START))
          add_text(result, @tree, :text)
          line = @src.current_line_number
          if (result = @src.scan(HTML_COMMENT_RE))
            @tree.children << Element.new(:xml_comment, result, nil, category: :block, location: line)
          elsif (result = @src.scan(HTML_INSTRUCTION_RE))
            @tree.children << Element.new(:xml_pi, result, nil, category: :block, location: line)
          elsif @src.scan(HTML_TAG_RE)
            if method(:handle_html_start_tag).arity.abs >= 1
              handle_html_start_tag(line, &block)
            else
              handle_html_start_tag(&block) # DEPRECATED: method needs to accept line number in 2.0
            end
          elsif @src.scan(HTML_TAG_CLOSE_RE)
            if @tree.value == (HTML_ELEMENT[@tree.value] ? @src[1].downcase : @src[1])
              done = true
            else
              add_text(@src.matched, @tree, :text)
              warning("Found invalidly used HTML closing tag for '#{@src[1]}' on " \
                      "line #{line} - ignoring it")
            end
          else
            add_text(@src.getch, @tree, :text)
          end
        else
          add_text(@src.rest, @tree, :text)
          @src.terminate
          if @tree.type == :html_element
            warning("Found no end tag for '#{@tree.value}' on line " \
                    "#{@tree.options[:location]} - auto-closing it")
          end
          done = true
        end
      end

      @tree = @stack.pop
    end
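The two stop conditions and the warning cases can be traced with a much reduced model of the loop. The sketch below is an assumption-laden simplification, not the library's implementation: `parse_raw` is hypothetical, nodes are plain hashes, only attribute-less start tags, end tags, and text are recognized, and explicit recursion stands in for both the block and the @stack/@tree bookkeeping.

```ruby
require 'strscan'

# Hypothetical, simplified version of the raw HTML loop.
def parse_raw(src, el, warnings)
  done = false
  until src.eos? || done
    if (text = src.scan_until(/(?=<)/))   # stop right before the next '<'
      el[:children] << text unless text.empty?
      if src.scan(/<(\w+)>/)
        child = { name: src[1].downcase, children: [] }
        el[:children] << child
        parse_raw(src, child, warnings)   # parse the child's body
      elsif src.scan(/<\/(\w+)\s*>/)
        if el[:name] == src[1].downcase
          done = true                     # matching end tag: stop here
        else
          el[:children] << src.matched    # stray end tag is kept as text
          warnings << "Found invalidly used HTML closing tag for '#{src[1]}'"
        end
      else
        el[:children] << src.getch        # a lone '<' is plain text
      end
    else
      el[:children] << src.rest           # no further tags: take the rest
      src.terminate
      warnings << "Found no end tag for '#{el[:name]}'" if el[:name]
      done = true
    end
  end
  el
end

root = { name: nil, children: [] }
warnings = []
parse_raw(StringScanner.new("a<b>bold</B></i>tail<em>open"), root, warnings)
p root[:children]
p warnings
```

Here `</B>` ends the `b` element, `</i>` has no matching open element and is kept as text with a warning, and `em` is auto-closed when the input ends.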