Clone an undirected graph return deep copy
Given a reference of a node in a connected undirected graph, return a deep copy (clone) of the graph. Each node in the graph contains a val (int) and a list (List[Node]) of its neighbors. https://leetcode.com/problems/clonegraph/
Trying to solve this problem off of leetcode and not sure exactly where I'm stuck and how to get to the correct solution
Example:
class Solution(object):
def cloneGraph(self, node):
newNode = Node(node.val, [])
oldNode = node
dic = {}
dic[oldNode] = newNode
bfs = [oldNode]
while bfs:
oldNode = bfs.pop(0)
newNode = dic[oldNode]
neighbors = []
for item in oldNode.neighbors:
if item not in dic:
copy = Node(item.val, [])
dic[item] = copy
neighbors.append(copy)
bfs.append(item)
else:
copy = dic[item]
neighbors.append(copy)
newNode.neighbors = neighbors
return newNode
See also questions close to this topic

what is time complexity of these python methods
what is the time complexity of
subgraph()
andnetworkx.connected_components()
. is the time complexity ofsubgraph()
O(V+E)? 
Hotencoded values & DataFrame for logistic regression
I am reaching out regarding a small challenge I am facing. I am basically trying to run a logistic regression on a dataset that has features with some categorical values. In order to process those features through the regression, I was planning to encode them
#Select categorical features only & encode name numerically with LabelEncoder cat_features = df.select_dtypes(include=[object]) label_enc = preprocessing.LabelEncoder() le_features = cat_features.apply(label_enc.fit_transform) #Aggregate all encoded values into a binary matrix enc = preprocessing.OneHotEncoder() enc.fit(le_features) final_cat_features = enc.transform(le_features).toarray()
After running this code, I do confirm it returns an encoded matrix
(4665, 290) <class 'numpy.ndarray'>
This is where I get stuck. How I am supposed to regenerate a dataframe from that exactly?! Should I concatenate the 290 columns together in order to end up with a new feature to add to my new dataframe? If not, I have to say I am stuck here.
Thanks for the help

np.linalg.svd() and/or np.linalg.eig() taking a lot of time in computation on a low dimension matrix for PCA (Principal Component Analysis)
I have Image data, and whenever I use 32x32 size image or 32x48 (some close sizes), I am able to perform PCA on it. However when I tried working on 128x128 sized image. My PC just got stuck and for hours it didn't work. My PC Specs are decent enough (16GB DDR4 ram2400Mhz i77th gen 7700HQ and 1050Ti 4 GB GPU).
Still its stuck and doesn't proceed ahead. It comsumes a large chunk of my RAM yet struggles to complete. I added print statements in order to track out the progress and according to my analysis , it is unable to compute linalg.svd() or linalg.eig() [whether I use numpy or scipy]. But in the case of 32x32 image , it works so fast.
def pca(X): m , n = X.shape covariance_matrix = 1/m * np.dot(X,X.T) print("COVARIANCE MAT CALCULATED !") [U, S, V] = np.linalg.svd(covariance_matrix) #eig_vals, eig_vecs = np.linalg.eig(covariance_matrix) # Tried this too but same outcome, just hang #print("EIGEN VALS AND VEC CALCULATED !") print("U S V CALCULATED !") return [U , S]
Eg. for a 32x32 image, I sent in the data as (1024,1) vector and the covariance matrix has the dimensions of (1024,1024). On this dimension or in other words on nxn matrix I am running the svd/eig. I am unable to do so for slight bigger images(128x128 for eg) and I am not getting any heads up why is it so.
Kindly share your valuable suggestions and approaches/tweeks/techniques I should try out to make this computation memory and time efficient. Also I would like to know, how I can let my GPU handle the complex mathematical computations (Ubuntu 18.04)

Does mysql use BFS or DFS?
Does Mysql store the data in a data graph structure? Let's say I:
insert into table (name) values ("Peter")
. Is Peter stored in a graph? If that's the case and I do:select from table where name="peter"
how is Peter found? Does it use breadth first search or depth first search?The thing is that I'm studying BFS and DFS but...I was wondering what the point in learning it was if it is far easier saving data to a database and just accessing data.

Breadth First Search implementing to vector of Linked List
I have the code for the graph in c++:
My Graph header file:
#include <string> #include <iostream> #include <map> #include <vector> using namespace std; struct vertex { string code; vertex* next; }; struct AdjList { vertex *head; AdjList(vertex* Given) { head = Given; } }; class Graph { map<string, string> associations; int nodeNum; //amount of nodes or size of the graph; vector<AdjList> adjList; public: Graph(int NodeNum); ~Graph(); int singleSize(string codeName); int getSize();// must destroy every prerequisite list connected to the node vertex* generateVertex(string codeName); int getIndexOfVertex(vertex* givenVertex); // will find the location of the vertex in the array void addVertex(vertex* newVertex); void addEdge(string codeName, string linkCodeName); void printPrerequisites(vertex* ptr, int i); bool deleteVertex(string codeName); bool deleteEdge(string codeName, string linkCodeName); bool elemExistsInGraph(string codeName); void printPrereq(string codeName); void printCourseTitle(string codeName); void printGraph(); };
I am trying to print all connected nodes within the graph using the breadth first search. Here is my code for the breadth first search algorithm that does not work.
void Graph::printPrereq(string codeName) { int adjListSize = this>adjList.size(); int index = getIndexOfVertex(generateVertex(codeName)); bool visited[this>adjList.size()]; for(int i = 0; i < adjListSize; i++) { visited[i] = false; } list<int> queue; visited[index] = true; queue.push_back(index); while(!queue.empty()) { index = queue.front(); vertex* pointer = this>adjList[index].head; cout << pointer>code; queue.pop_front(); while(pointer != nullptr){ if(!visited[getIndexOfVertex(pointer)]) { queue.push_back(getIndexOfVertex(pointer)); visited[getIndexOfVertex(pointer)] = true; } cout << pointer>code <<">"; pointer = pointer>next; } cout << "Null" << endl; } }
This algorithm outputs nodes that are only within the linked list, but not the ones that are connected through the graph.
Can anybody help and solve this problem?

Compute distance using DFS
I was torn between these two methods:
M1:
 Use adjacency list to represent graph G with vertices P and edges A
 Use DFS on G storing all the distances from p in an array d;
 Loop through d checking all entries. If some d[u] >6, return false otherwise true
M2:
 Use adjacency list to represent graph G with vertices P and edges A
 Use BFS on G storing all the distances from p in an array d;
 Loop through d checking all entries. If some d[u] >6, return false otherwise true
Both these methods will produce a worst case
O(P + A)
, therefore I think that both would be a correct answer to this question. I had chosen the DFS method, with the reasoning that with DFS you should be able to find the "outlier" of freedom degree 7 earlier than with BFS, since with BFS you would have to traverse every single Vertex until degree 7 in every case.Apparently this is wrong according to the teacher, as using DFS, you can't compute the distances. I don't understand why you wouldn't be able to compute the distances. I could have a number
n
indicating the degree of freedom I am currently at. Starting from rootp
, the child would haven = 1
. Now I storen
in arrayd
. Then I keep traversing down until no child is to be found, whileincrementing n
and storing the value in my arrayd
. Then, if the backtracking starts, the valuen
will be decremented until we find an unvisited child node of any of the visited nodes on the stack. If there is an unvisited child, increment once again, then increment until no more child is found, decrement until the next unvisited child from the stack is found...I believe that would be a way to store the distances with DFS