现在的位置: 首页 > 综合 > 正文

Amazon Campus(2013-Sep-24)Question 2 / 2 (Amazon Campus(17):Find the differences of items in amazon)

2018年02月18日 ⁄ 综合 ⁄ 共 2151字 ⁄ 字号 评论关闭
Question 2 / 2 (Amazon Campus(17):Find the differences of items in amazon)

Amazon has millions of different items in different categories right now, so when sellers want to sell items in our website, sellers want to find the right categories their items belong to.  Suppose we want to build a system to help sellers find the minimal
differences items and then find the right category. The difference index is a number that sum of single-word edits (insertion, deletion, substitution) required to change one phrase into the other:
For example, we get two lines from standard input Hadoop in practice
Hadoop operations
The difference index  of ‘Hadoop in practice’ and ‘Hadoop operations’ is 2. Because we can remove ‘practice’ and substitute ‘in’ with ‘operations’, then ‘Hadoop in practice’ can convert to ‘Hadoop operations’

For example, we get two lines from standard input
Hadoop cookbook
Hadoop operations
The difference index of ‘Hadoop cookbook’ and ‘Hadoop operations’ is 1. Because we can substitute ‘cookbook’ with ‘operations’ then convert 'Hadoop cookbook' can convert to 'Hadoop operations'

For example, we get two lines from standard input:
Ruby in action
Hadoop operations
The difference index of ‘Ruby in action’ and ‘Hadoop operations’ is 3. Because we can substitute ‘Ruby’ with ‘Hadoop’, ‘in’ with ‘operations’ and remove ‘action’ then 'Ruby in action' can convert to 'Hadoop operations'

测试案例:
"Ruby in action","Hadoop operations" ----3
"Hadoop in practice","Hadoop operations" ----2
"Hadoop cookbook","Hadoop operations" ----1
"Kindle Fire HD Tablet","Kindle Fire HD 8.9\" 4G LTE Wireless Tablet" ----4

首先查找二者匹配的单词数maxMatch,再用二者单词数较多的减去maxMatch即可。因为不匹配的单词采用insertion,deletion或substitution中的某种方式一步即可完成。代码如下:

    static int differ(String string1, String string2) {
    	int res = 0;
    	int maxMatch = 0;
    	if(string1.equals(string2)) return res;
    	String []words1 = string1.split(" ");
    	String []words2 = string2.split(" ");
    	
    	List<String> words1List = Arrays.asList(words1);
    	List<String> words2List = Arrays.asList(words2);
    	
    	for(int k=0; k<words2List.size(); k++) {
    		int match = 0;
	    	for(int i=k; i<words2List.size(); i++) {
	    		String word2 = words2List.get(i);
	    		
	    		for(int j=0; j<words1List.size(); j++) {
	    			String word1 = words1List.get(j);
	    			if(word1.equals(word2)) {
	    				match++;
	    				break;
	    			} else {
	    			}
	    		}
	    	}
	    	if(match > maxMatch) maxMatch = match;
    	}
    	if(words1List.size() > words2List.size()) {
    		res += words1List.size() - maxMatch;
    	} else {
    		res += words2List.size() - maxMatch;
    	}
    	return res;
    }

内牛满面,其实是最长公共子序列问题,采用DP的方法去百度之。

抱歉!评论已关闭.